Can someone explain how BCrypt verifies a hash?

I'm using C# and BCrypt.Net to hash my passwords.
For example:
string salt = BCrypt.Net.BCrypt.GenerateSalt(6);
var hashedPassword = BCrypt.Net.BCrypt.HashPassword("password", salt);
//This evaluates to True. How? I'm not telling it the salt anywhere, nor
//is it a member of a BCrypt instance because there IS NO BCRYPT INSTANCE.
Console.WriteLine(BCrypt.Net.BCrypt.Verify("password", hashedPassword));
Console.WriteLine(hashedPassword);
How is BCrypt verifying the password with the hash if it's not saving the salt anywhere? The only idea I have is that it's somehow appending the salt at the end of the hash.
Is this a correct assumption?

A BCrypt hash string looks like:
$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm
\__/\/ \____________________/\_____________________________/
 |  |           Salt                      Hash
 |  Cost
 Version
Where
2a: Algorithm Identifier (BCrypt, UTF8 encoded password, null terminated)
10: Cost Factor (2^10 = 1,024 rounds)
Ro0CUfOqk6cXEKf3dyaM7O: OpenBSD-Base64 encoded salt (22 characters, 16 bytes)
hSCvnwM9s4wIX9JeLapehKK5YdLxKcm: OpenBSD-Base64 encoded hash (31 characters, which encode 23 of the 24 hash bytes)
Edit: I just noticed these words fit exactly. I had to share:
$2a$10$TwentytwocharactersaltThirtyonecharacterspasswordhash
$==$==$======================-------------------------------
BCrypt does create a 24-byte binary hash, using a 16-byte salt. You're free to store the binary hash and the salt however you like; nothing says you have to Base64-encode them into a string.
But BCrypt was created by guys who were working on OpenBSD. OpenBSD already defines a format for their password file:
$[HashAlgorithmIdentifier]$[AlgorithmSpecificData]
This means that the "bcrypt specification" is inextricably linked to the OpenBSD password file format. And whenever anyone creates a "bcrypt hash" they always convert it to an ISO-8859-1 string of the format:
$2a$[Cost]$[Base64Salt][Base64Hash]
A few important points:
2a is the algorithm identifier
1: MD5
2: early bcrypt, which had confusion over which encoding passwords are in (obsolete)
2a: current bcrypt, which specifies passwords as UTF-8 encoded
Cost is a cost factor used when computing the hash. The "current" value is 10, meaning the internal key setup goes through 1,024 rounds
10: 2^10 = 1,024 iterations
11: 2^11 = 2,048 iterations
12: 2^12 = 4,096 iterations
The Base64 algorithm used by the OpenBSD password file is not the same Base64 encoding that everybody else uses; it has its own alphabet:
Regular Base64 Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
BSD Base64 Alphabet: ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
So implementations of bcrypt cannot use any built-in, or standard, Base64 library.
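To make the alphabet difference concrete, here is a minimal sketch of a decoder for the BSD alphabet quoted above. The names BsdBase64 and Decode are made up for illustration and are not part of BCrypt.Net. The sketch assumes unpadded input and the usual Base64 bit order, which is what bcrypt strings use.

```csharp
using System;
using System.Collections.Generic;

public static class BsdBase64
{
    // The BSD alphabet quoted above: different order from standard Base64,
    // and '.' where standard Base64 has '+'.
    private const string Alphabet =
        "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Decodes unpadded BSD-Base64 text; leftover bits at the end are discarded.
    public static byte[] Decode(string text)
    {
        List<byte> bytes = new List<byte>();
        int buffer = 0; // bits read but not yet emitted
        int bits = 0;   // number of valid bits in buffer
        foreach (char c in text)
        {
            int value = Alphabet.IndexOf(c);
            if (value < 0)
                throw new FormatException("Not a BSD-Base64 character: " + c);
            buffer = (buffer << 6) | value;
            bits += 6;
            if (bits >= 8)
            {
                bits -= 8;
                bytes.Add((byte)(buffer >> bits));
                buffer &= (1 << bits) - 1; // keep only the unread low bits
            }
        }
        return bytes.ToArray();
    }
}
```

Decoding the 22-character salt from the example string gives 16 bytes. The 31-character hash part decodes to 23 bytes (31 x 6 = 186 bits), because the encoded string leaves out the last byte of the 24-byte result.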
Armed with this knowledge, you can now verify a password correctbatteryhorsestapler against the saved hash:
$2a$12$mACnM5lzNigHMaf7O1py1O3vlf6.BA8k8x3IoJ.Tq3IB/2e7g61Km
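Verification boils down to two steps: split the stored string into its fields, then re-run bcrypt on the candidate password with the extracted cost and salt and compare the result with the stored hash part. The sketch below shows only the splitting step. BcryptParts and Parse are hypothetical names for illustration, not BCrypt.Net API, and the length check assumes the 22 + 31 character layout described above.

```csharp
using System;
using System.Globalization;

public sealed class BcryptParts
{
    public string Version; // e.g. "2a"
    public int Cost;       // log2 of the number of key-setup rounds
    public string Salt;    // 22 BSD-Base64 characters
    public string Hash;    // 31 BSD-Base64 characters

    public long Rounds { get { return 1L << Cost; } }

    public static BcryptParts Parse(string hashString)
    {
        // "$2a$10$<53 chars>" splits on '$' into: "", "2a", "10", "<53 chars>"
        string[] fields = hashString.Split('$');
        if (fields.Length != 4 || fields[0].Length != 0 || fields[3].Length != 53)
            throw new FormatException("Not a bcrypt hash string.");

        BcryptParts parts = new BcryptParts();
        parts.Version = fields[1];
        parts.Cost = int.Parse(fields[2], CultureInfo.InvariantCulture);
        parts.Salt = fields[3].Substring(0, 22); // salt comes first
        parts.Hash = fields[3].Substring(22);    // remaining 31 characters
        return parts;
    }
}
```

For the string above, Parse yields version 2a, cost 12 (4,096 rounds) and the salt mACnM5lzNigHMaf7O1py1O. No separate salt storage is needed because everything required to repeat the computation travels inside the string.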
BCrypt variants
There is a lot of confusion around the bcrypt versions.
$2$
BCrypt was designed by the OpenBSD people. It was designed to hash passwords for storage in the OpenBSD password file. Hashed passwords are stored with a prefix to identify the algorithm used. BCrypt got the prefix $2$.
This was in contrast to the other algorithm prefixes:
$1$: MD5
$5$: SHA-256
$6$: SHA-512
$2a$
The original BCrypt specification did not define how to handle non-ASCII characters, or how to handle a null terminator. The specification was revised to specify that when hashing strings:
the string must be UTF-8 encoded
the null terminator must be included
$2x$, $2y$ (June 2011)
A bug was discovered in crypt_blowfish, a PHP implementation of BCrypt. It was mishandling characters with the 8th bit set.
They suggested that system administrators update their existing password database, replacing $2a$ with $2x$, to indicate that those hashes are bad (and need to use the old broken algorithm). They also suggested the idea of having crypt_blowfish emit $2y$ for hashes generated by the fixed algorithm. Nobody else, including canonical OpenBSD, adopted the idea of 2x/2y. This version marker was limited to crypt_blowfish.
The versions $2x$ and $2y$ are not "better" or "stronger" than $2a$. They are remnants of one particular buggy implementation of BCrypt.
$2b$ (February 2014)
A bug was discovered in the OpenBSD implementation of BCrypt. They wrote their implementation in a language that doesn't support strings, so they were faking it with a length prefix, a pointer to a character, and then indexing that pointer with []. Unfortunately they were storing the length of their strings in an unsigned char. If a password was longer than 255 characters, the length would overflow and wrap around. BCrypt was created for OpenBSD. When they had a bug in their library, they decided it was OK to bump the version. This means that everyone else needs to follow suit if they want to remain current with "their" specification.
http://undeadly.org/cgi?action=article&sid=20140224132743
http://marc.info/?l=openbsd-misc&m=139320023202696
If you were doing the right thing from the beginning (storing strings in UTF-8 and also hashing the null terminator), then there is no difference between 2, 2a, 2x, 2y, and 2b. If you wrote your implementation correctly, they all output the same result.
The version $2b$ is not "better" or "stronger" than $2a$. It is a remnant of one particular buggy implementation of BCrypt. But since BCrypt canonically belongs to OpenBSD, they get to change the version marker to whatever they want.
The versions $2x$ and $2y$ are not better, or even preferable, to anything. They are remnants of a buggy implementation and should be summarily forgotten.
The only people who need to care about 2x and 2y are those who may have been using crypt_blowfish back in 2011. And the only people who need to care about 2b are those who may have been running OpenBSD.
All other correct implementations are identical and correct.

How is BCrypt verifying the password with the hash if it's not saving the salt anywhere?
Clearly it is not doing any such thing. The salt has to be saved somewhere.
Let's look up password encryption schemes on Wikipedia. From http://en.wikipedia.org/wiki/Crypt_(Unix) :
The output of the function is not merely the hash: it is a text string which also encodes the salt and identifies the hash algorithm used.
Alternatively, an answer to your previous question on this subject included a link to the source code. The relevant section of the source code is:
StringBuilder rs = new StringBuilder();
rs.Append("$2");
if (minor >= 'a') {
    rs.Append(minor);
}
rs.Append('$');
if (rounds < 10) {
    rs.Append('0');
}
rs.Append(rounds);
rs.Append('$');
rs.Append(EncodeBase64(saltBytes, saltBytes.Length));
rs.Append(EncodeBase64(hashed, (bf_crypt_ciphertext.Length * 4) - 1));
return rs.ToString();
Clearly the returned string is version information, followed by the number of rounds used, followed by the salt encoded as base64, followed by the hash encoded as base64.

Related

C# AES-256 Unicode Key

I need to make a strong key for AES-256 as a) Unicode characters, b) a key in bytes.
a) I have to generate 50 random Unicode characters and then convert them to bytes. Is it possible to use Unicode characters as an AES-256 key?
For example, I want to use this page to create the password.
Is there any way to import all characters from the Windows character table into a Windows Forms app?
b) I'm using this code:
System.Security.Cryptography.AesCryptoServiceProvider key = new System.Security.Cryptography.AesCryptoServiceProvider();
key.KeySize = 256;
key.GenerateKey();
byte[] AESkey = key.Key;
Is it enough, or should I change something?
Also, I have one more question. Will making an AES key longer than 43 ASCII characters be more secure, or will it be hashed to 256 bits anyway? And is there a difference between an ASCII key of 43 characters and one of 100?

a) I have to generate 50 random Unicode characters and then convert them to bytes. Is it possible to use Unicode characters as an AES-256 key?
Yes, this is possible. Since you have plenty of space for characters, you can just Base64-encode the key: ceil(32 / 3) * 4 = 44, so 50 characters are enough for this. You would not be using the additional space provided by Unicode encoding though. Obviously you would need to convert it back to binary before using it.
b) is aes.GenerateKey "enough"?
Yes, aes.GenerateKey is enough to generate a binary AES key.
c) Will making an AES key longer than 43 ASCII characters be more secure, or will it be hashed to 256 bits anyway? And is there a difference between an ASCII key of 43 characters and one of 100?
An AES key is not hashed at all. It's just 128, 192 or 256 bits (i.e. 16, 24 or 32 bytes) of data that should be indistinguishable from random (to somebody that doesn't know the value, of course). If you want to hash something you'd have to do it yourself - but please read on.
The important thing to understand is that a password is not a key, and that keys for modern ciphers are almost always encoded as binary. For AES there is no such thing as an ASCII key. If you need to encode the key, use base 64.
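As a small sketch of the points above: generate a random 256-bit key, carry it around as Base64 text, and turn it back into bytes before use. The class and method names are made up for illustration, and Aes.Create() is used here in place of the AesCryptoServiceProvider from the question.

```csharp
using System;
using System.Security.Cryptography;

public static class AesKeyText
{
    // Generates a random 256-bit AES key and returns it as Base64 text.
    public static string NewKeyAsBase64()
    {
        using (Aes aes = Aes.Create())
        {
            aes.KeySize = 256;
            aes.GenerateKey();
            return Convert.ToBase64String(aes.Key);
        }
    }

    // Converts the text form back into the 32 key bytes AES actually uses.
    public static byte[] KeyFromBase64(string text)
    {
        byte[] key = Convert.FromBase64String(text);
        if (key.Length != 32)
            throw new ArgumentException("Expected a 256-bit key.");
        return key;
    }
}
```

The text form is always 44 characters for a 32-byte key, matching the ceil(32 / 3) * 4 calculation.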
If you want to use a password then you need to use a key derivation function or KDF. Furthermore, if you want to protect against dictionary and rainbow table attacks you will want to use a password based key derivation function or PBKDF. Such a KDF is also called a "password hash". In case of .NET your best bet is Rfc2898DeriveBytes which implements PBKDF2. PBKDF2 is defined in RFC 2898, titled "PKCS #5: Password-Based Cryptography Specification Version 2.0", which you may want to read.
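A minimal sketch of deriving a 256-bit AES key from a password with Rfc2898DeriveBytes follows. The class name, the 16-byte salt size and the choice of SHA-256 are assumptions made for illustration, and the constructor overload taking a HashAlgorithmName needs a reasonably recent framework (.NET Framework 4.7.2 or .NET Core 2.0 and later). The iteration count is left to the caller; pick it as high as your hardware tolerates.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordKey
{
    // Random salt; store it next to the ciphertext, it is not secret.
    public static byte[] NewSalt()
    {
        byte[] salt = new byte[16];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return salt;
    }

    // PBKDF2: the same password, salt and iteration count always give the same key.
    public static byte[] DeriveAesKey(string password, byte[] salt, int iterations)
    {
        using (Rfc2898DeriveBytes kdf =
            new Rfc2898DeriveBytes(password, salt, iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(32); // 32 bytes = 256-bit AES key
        }
    }
}
```

The password can be any length and any characters; the KDF always produces exactly 32 bytes, which is why a longer password is never "truncated" or padded to fit AES.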

If I replace a character in an MD5 hash, does that increase the possibility of collisions?

We're generating hashes to provide identifiers for documents being stored in RavenDB. We're doing this as there is a limit on the length of the DocumentID (127 characters - ESent limitation) if you want to use BulkInsert like so:
_documentStore.BulkInsert(options: new BulkInsertOptions { CheckForUpdates = true }))
In order for the BulkInsert to work, the DocumentID needs to match the row being upserted; so we want a DocumentID that can be regenerated from the same source string consistently.
An MD5 hash will provide us a fixed length value with a low probability of collision, with the code used to generate the hash below:
public static string GetMD5Hash(string inputString)
{
    HashAlgorithm algorithm = MD5.Create();
    var hashBytes = algorithm.ComputeHash(Encoding.UTF8.GetBytes(inputString));
    return Encoding.UTF8.GetString(hashBytes);
}
However, RavenDB does not support "\" in a DocumentID, so I want to replace it with "/". My fear is that in doing so we are increasing the likelihood of a hashing conflict.
Code I want to change to:
public static string GetMD5Hash(string inputString)
{
    HashAlgorithm algorithm = MD5.Create();
    var hashBytes = algorithm.ComputeHash(Encoding.UTF8.GetBytes(inputString));
    return Encoding.UTF8.GetString(hashBytes).Replace('\\', '/');
}
Will this increase the likelihood of hash conflicts and remove our ability to depend on the DocumentID as "unique"?

This is an X-Y problem: instead of converting the byte array into a string with an encoding that is known to handle arbitrary bytes correctly, such as Base64, you are decoding it as UTF-8.
Reading a random byte array as a UTF-8 string will produce non-printable and zero characters, as well as random failures due to invalid UTF-8 sequences.
Use Base64 (or Base32 if you need a case-insensitive string). If some characters are still not supported, replace them with other unique ones. For example, URL-friendly Base64 uses - and _ and no padding, to simplify encoding as a query parameter.
To the original question:
A hash of any kind can't be considered a "unique ID" for a document, due to the possibility of collisions.
Yes, replacing one character with another that could already appear in the string will decrease the number of possible combinations and increase the possibility of collision. I can't estimate it properly; a math- or statistics-specific question may be needed if you really need a precise answer.
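The Base64 suggestion can be sketched as below. This is an illustration, not a drop-in for the asker's project: the class and method names are invented, and the character replacements follow the URL-friendly Base64 convention mentioned above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class DocumentIds
{
    // Returns a 22-character ID built from the MD5 of the input.
    // The output uses only A-Z, a-z, 0-9, '-' and '_', so it contains no '\' or '/'.
    public static string GetMD5Id(string inputString)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hashBytes = md5.ComputeHash(Encoding.UTF8.GetBytes(inputString));
            return Convert.ToBase64String(hashBytes) // 24 characters ending in "=="
                .TrimEnd('=')                        // drop the padding
                .Replace('+', '-')                   // URL-friendly substitutions,
                .Replace('/', '_');                  // each one-to-one
        }
    }
}
```

Because every substitution maps one character to a character that is otherwise unused, distinct hashes still produce distinct IDs, so this encoding adds no collisions beyond those of MD5 itself.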

You increase the probability of collision, but only slightly. All "/" in the output hash are like 'wildcards' which match either "/" or "\" in the raw hash. If you have zero of these in a hash, nothing changes. If you have one of these in a hash, there are now twice as many documents that can match that hash. If you have two in a hash, there are four times as many. Having many more is unlikely given the alphabet and the length of the MD5 hash.
The probability of a collision is still pretty small (unless you have a huge number of documents, etc).
However, you should do what was suggested in comments and use a Base64 or HEX string to store the MD5.
Bad things can happen in cryptography when you 'roll your own' and try and modify protocols which you don't have an inside-out understanding of. You should always always stick to doing standard things which have been tested theoretically and in practice and found to be reasonable. Bruce Schneier puts across this principle at length in Practical Cryptography and elsewhere.

Use Base64 instead of UTF8 and you will solve your problem (no more \).
Have a look at Convert.ToBase64String.

RNGCryptoServiceProvider and Zeros?

Walking through some cryptography stuff, I saw that RNGCryptoServiceProvider has 2 methods:
RNGCryptoServiceProvider.GetNonZeroBytes
and
RNGCryptoServiceProvider.GetBytes
And so I ask:
What is wrong with filling an array of bytes with a cryptographically strong sequence of random values, some (0 or more) of which are zeros?
(The values are random, so apparently there won't be many zeros, and zero is still a regular number.)
Why did they create the distinction?

Within the .NET framework, GetNonZeroBytes(byte[]) is used when generating PKCS#1 padding for RSA encryption, which uses 0x00 as a separator.
Using a tool like Reflector, you can see it used in RSAPKCS1KeyExchangeFormatter.CreateKeyExchange(byte[]) to implement padding as per RFC 2313, section 8.1.2 (RFC 3218 has some nice ASCII art that demonstrates the byte layout more clearly).
GetNonZeroBytes(byte[]) could also be used to generate salt. The Cryptography StackExchange site has a similar question which suggests that avoiding 0x00 is to help with libraries and APIs that may treat the salt as a zero-terminated string, which would accidentally truncate the salt. However, unless one is using P/Invoke, this is unlikely to be a concern in .NET.
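To show why the padding bytes must be non-zero, here is a rough sketch of the PKCS#1 v1.5 encryption block layout (0x00, 0x02, non-zero random padding, 0x00, message). It is an illustration only: the class and method names are invented, the unpadding is not hardened against padding-oracle attacks, and real code should let the framework's RSA classes do this.

```csharp
using System;
using System.Security.Cryptography;

public static class Pkcs1PaddingSketch
{
    // Builds: 0x00 | 0x02 | non-zero random padding (at least 8 bytes) | 0x00 | message
    public static byte[] Pad(byte[] message, int blockLength)
    {
        int paddingLength = blockLength - message.Length - 3;
        if (paddingLength < 8)
            throw new ArgumentException("Message too long for this block length.");

        byte[] padding = new byte[paddingLength];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetNonZeroBytes(padding); // a zero here would look like the separator
        }

        byte[] block = new byte[blockLength];
        block[0] = 0x00;
        block[1] = 0x02;
        Buffer.BlockCopy(padding, 0, block, 2, paddingLength);
        block[2 + paddingLength] = 0x00; // separator
        Buffer.BlockCopy(message, 0, block, 3 + paddingLength, message.Length);
        return block;
    }

    // Finds the first 0x00 after the header; everything after it is the message.
    public static byte[] Unpad(byte[] block)
    {
        if (block.Length < 11 || block[0] != 0x00 || block[1] != 0x02)
            throw new CryptographicException("Bad padding header.");
        int separator = Array.IndexOf(block, (byte)0x00, 2);
        if (separator < 10)
            throw new CryptographicException("Separator missing or padding too short.");
        byte[] message = new byte[block.Length - separator - 1];
        Buffer.BlockCopy(block, separator + 1, message, 0, message.Length);
        return message;
    }
}
```

The message itself may contain zeros, since only the first 0x00 after the header is treated as the separator. A zero inside the random padding, on the other hand, would cut the padding short and corrupt the recovered message, which is the case GetNonZeroBytes rules out.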

How to hash a password with SHA512

In my previous question I was told to hash passwords instead of encrypting them, and that turned out to be correct. The problem is, I've never dealt with hashing passwords before, and all the docs say SHA512, which I've tried to use on a test account to no avail. I'm not sure where to go from here. The code comments give an example "encrypted" string, as they call it: "FA35A0194E3BE7024CEFB1839CBFC922". I'm not sure how to produce that format with SHA512, since all the ComputeHash() method takes and gives back is a byte array or stream:
byte[] hashedPassword = HashAlgorithm.Create("SHA512").ComputeHash( ??? );
UPDATE
I've tried printing out the UTF8Encoding.GetString on the bytes, but it just displays a bunch of bullshit characters that look nothing like the one in the example docs.

Hashing with plain SHA-512 is still wrong. Use PBKDF2 which is exposed via Rfc2898DeriveBytes.
It returns raw bytes, which you should encode with either hex or base64.
You can do hex encoding with:
BitConverter.ToString(bytes).Replace("-","")
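Putting the two pieces together, a minimal sketch of PBKDF2 password hashing with hex output might look like this. The storage format "iterations:saltHex:hashHex", the class name and the constants are assumptions chosen for illustration, not a standard; the HashAlgorithmName constructor overload needs .NET Framework 4.7.2 or .NET Core 2.0 and later.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;

public static class PasswordHasher
{
    private const int SaltSize = 16;       // bytes (assumed)
    private const int HashSize = 32;       // bytes (assumed)
    private const int Iterations = 100000; // assumed; tune for your hardware

    // Returns "iterations:saltHex:hashHex" so everything needed to verify is stored together.
    public static string Hash(string password)
    {
        byte[] salt = new byte[SaltSize];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        byte[] hash = Derive(password, salt, Iterations, HashSize);
        return Iterations.ToString(CultureInfo.InvariantCulture) + ":" + ToHex(salt) + ":" + ToHex(hash);
    }

    public static bool Verify(string password, string stored)
    {
        string[] parts = stored.Split(':');
        if (parts.Length != 3) return false;
        int iterations = int.Parse(parts[0], CultureInfo.InvariantCulture);
        byte[] salt = FromHex(parts[1]);
        byte[] expected = FromHex(parts[2]);
        byte[] actual = Derive(password, salt, iterations, expected.Length);

        // Compare without exiting early on the first mismatch.
        int diff = expected.Length ^ actual.Length;
        for (int i = 0; i < expected.Length && i < actual.Length; i++)
            diff |= expected[i] ^ actual[i];
        return diff == 0;
    }

    private static byte[] Derive(string password, byte[] salt, int iterations, int length)
    {
        using (Rfc2898DeriveBytes kdf =
            new Rfc2898DeriveBytes(password, salt, iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(length);
        }
    }

    private static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "");
    }

    private static byte[] FromHex(string hex)
    {
        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }
}
```

Store the string returned by Hash and later pass it to Verify together with the password the user typed. Hashing the same password twice gives different strings because the salt is random each time.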

Are you sure it said SHA-512? That example string is only 128 bits (32 hex characters). Anyway, you could use something like:
byte[] hashBytes = System.Security.Cryptography.SHA512.Create().ComputeHash(System.Text.Encoding.UTF8.GetBytes("NotHashedPass"));
System.String Hashed = System.BitConverter.ToString(hashBytes).Replace("-", "");
MessageBox.Show(Hashed);
but I'd recommend at least using a salt.

Please see tutorial here:
http://www.obviex.com/samples/hash.aspx
From the tutorial:
"These code samples demonstrate how to hash data and verify hashes. It supports several hashing algorithms. To help reduce the risk of dictionary attacks, the code prepends random bytes (so-called salt) to the original plain text before generating hashes and appends them to the generated ciphertext (original salt value will be needed for hash verification). The resulting ciphertext is base64-encoded. IMPORTANT: DATA HASHES CANNOT BE DECRYPTED BACK TO PLAIN TEXT"

Format-preserving Encryption sample

I want to encrypt/decrypt digits into a string (with only digits and/or uppercase characters) of the same length using Format-preserving Encryption. But I can't find implementation steps. So, can anyone please provide a WORKING sample for C# 2.0?
For example:
If I encrypt fixed-length plaintext like 99991232 (with or without a fixed key), then the cipher text should be like 23220978 or ED0FTS. If the encrypted string is shorter than the plain text, that is also all right. But the cipher text must not be longer than the plain text, and the cipher text must be of fixed length.

From your question I assume that the plain text is numeric, where the cipher text could be alphanumeric. Due to this it is quite easy to make an encoding scheme. This makes your format preservation less stringent and this can be taken advantage of (this won't work if your plain text is also alphanumeric).
First, find a power of 2 that is greater than the number of discrete values that you have, for example, in the numeric case you have 10 discrete values - so you would use 16 (2 ^ 4). Create a 'BaseX' encoding scheme for this (in this case Base16) and decode the plain text to binary using it.
Thus given the plain text:
1, 2, 3, 4
We encode it to:
0001-0010 0011-0100
You can then run this through your length-preserving cipher (one example of a length-preserving cipher is AES in counter mode). Say you get the following value back:
1001-1100 1011-1100
Encode this using your 'BaseX' encoder, and in our case we would get:
9, C, B, C
Which is the same length. I threw together a sample for you (bit large to paste here).
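The idea above can be sketched roughly as follows. This is not the answerer's sample: it treats each digit as a 4-bit value, XORs it with a keystream nibble, and writes the result as a Base16 character. The keystream is a single AES block (the encrypted nonce), which stands in for counter mode and limits input to 32 characters. All names are invented, it uses the Aes class rather than anything available in C# 2.0, and reusing a nonce with the same key leaks information, so treat it as an illustration only.

```csharp
using System;
using System.Security.Cryptography;

public static class NibbleCipherSketch
{
    private const string Hex = "0123456789ABCDEF";

    // XORs each hex digit of the input with a keystream nibble.
    // Applying it twice with the same key and nonce restores the input.
    public static string Transform(string hexText, byte[] key, byte[] nonce)
    {
        if (nonce.Length != 16) throw new ArgumentException("Nonce must be 16 bytes.");
        if (hexText.Length > 32) throw new ArgumentException("At most 32 characters.");

        byte[] stream;
        using (Aes aes = Aes.Create())
        {
            aes.Mode = CipherMode.ECB;      // one raw block, used only as keystream
            aes.Padding = PaddingMode.None;
            aes.Key = key;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                stream = encryptor.TransformFinalBlock(nonce, 0, 16);
            }
        }

        char[] output = new char[hexText.Length];
        for (int i = 0; i < hexText.Length; i++)
        {
            int nibble = Convert.ToInt32(hexText[i].ToString(), 16);
            int keyNibble = (i % 2 == 0) ? (stream[i / 2] >> 4) : (stream[i / 2] & 0x0F);
            output[i] = Hex[nibble ^ keyNibble];
        }
        return new string(output);
    }
}
```

Calling Transform on 99991232 returns an 8-character string of digits and the letters A to F, and calling it again on that result with the same key and nonce returns 99991232.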

As Henk said, "Format Preserving Encryption" is not defined. I can think of two possible answers:
Use AES and convert the cyphertext byte array to a hex string or to Base64.
Use a simple Vigenère cipher just replacing the characters you want to replace.
You need to specify your requirement more clearly.
ETA: You do not say how secure you need this to be. Standard Vigenère is not secure against any sort of strong attack, but will be safe from casual users. Vigenère can be made absolutely secure, but that requires as much true random key material as there is plaintext to encypher, and is usually impractical.
