We introduced password encryption to our site.
The salt is calculated as shown below:
Rfc2898DeriveBytes hasher = new Rfc2898DeriveBytes(Username.ToLowerInvariant(),
System.Text.Encoding.Default.GetBytes("Wn.,G38uI{~6y8G-FA4);UD~7u75%6"), 10000);
string salt = Convert.ToBase64String(hasher.GetBytes(25));
For most usernames the salt is always the same.
But for some usernames it changes at every call.
Can someone tell me what we are doing wrong?
Assuming you're using Rfc2898DeriveBytes to hash the password itself as well, then @CodesInChaos is correct. What you're doing wrong is:
Building the salt based off the username, instead of using a cryptographic PRNG to generate a fresh salt for each user.
You should use something like the .NET RNGCryptoServiceProvider class to generate 8 to 16 (binary) bytes of random salt.
For instance, from Rfc2898DeriveBytes Example 1:
byte[] salt1 = new byte[8];
using (RNGCryptoServiceProvider rngCsp = new RNGCryptoServiceProvider())
{
    // Fill the array with a random value.
    rngCsp.GetBytes(salt1);
}
The salt should then be stored in the clear in your database alongside the password hash and iteration count (so you can change it), and probably a version code too (so you can change it again, i.e. your current calculated salt method is version 1, and the random salt is version 2).
Spending 20,000 iterations of PBKDF2 on the salt, rather than spending it on the actual password hash!
10,000 iterations for the first 20 bytes, since RFC2898DeriveBytes is PBKDF2-HMAC-SHA-1, and SHA-1 has a native 20 byte output
10,000 more iterations for the next 20 bytes, which are then truncated to only the 5 you need to get to a 25 byte output.
This is a weakness, as the defender has to spend that time on every login, whether it's spent on the salt or on the password hashing. The attacker has to spend that time once for each username, and then they are going to store the results and try _illions (where _ is very large) of password guesses.
Thus, the attacker has a greater than normal marginal advantage because they can precalculate the salt, while you have to calculate it on the fly.
If you aren't using RFC2898DeriveBytes, another PBKDF2 implementation, BCrypt, or SCrypt to do the actual password hashing, then that's what you're doing wrong.
Trimming the username some, but not all, of the time is entirely incidental; just make sure not to trim passwords before they're hashed.
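The storage advice above (a fresh random salt per user, kept in the clear next to the hash, the iteration count and a version code) could be sketched roughly as follows. The names StoredPassword, PasswordHasher and NeedsUpgrade are hypothetical, the 100,000 iteration count is only an assumption to be tuned for your hardware, and SHA-1 with a 20 byte output is used because that is what the answer describes.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical record layout: none of these fields needs to be secret.
public sealed class StoredPassword
{
    public int Version;                        // 1 = old username-derived salt, 2 = random salt
    public int Iterations;                     // stored so it can be raised later
    public byte[] Salt = Array.Empty<byte>();
    public byte[] Hash = Array.Empty<byte>();
}

public static class PasswordHasher
{
    public const int CurrentVersion = 2;
    public const int CurrentIterations = 100000; // assumption: tune to your hardware

    public static StoredPassword Create(string password)
    {
        // 16 random bytes from the cryptographic PRNG, never derived from the username.
        byte[] salt = new byte[16];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return new StoredPassword
        {
            Version = CurrentVersion,
            Iterations = CurrentIterations,
            Salt = salt,
            Hash = Derive(password, salt, CurrentIterations)
        };
    }

    public static bool Verify(string password, StoredPassword stored)
    {
        // Re-derive with the stored salt and iteration count, then compare in constant time.
        byte[] candidate = Derive(password, stored.Salt, stored.Iterations);
        return CryptographicOperations.FixedTimeEquals(candidate, stored.Hash);
    }

    // True when the record should be re-hashed on the next successful login.
    public static bool NeedsUpgrade(StoredPassword stored) =>
        stored.Version < CurrentVersion || stored.Iterations < CurrentIterations;

    private static byte[] Derive(string password, byte[] salt, int iterations)
    {
        // 20 bytes = one SHA-1 block, so all iterations are spent on the password hash.
        using (var kdf = new Rfc2898DeriveBytes(password, salt, iterations, HashAlgorithmName.SHA1))
        {
            return kdf.GetBytes(20);
        }
    }
}
```

Because the version and iteration count travel with each record, a login that succeeds against an old record can be followed by Create to replace it with a current one.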
I want to hash passwords before storing them to the database. There are many samples out there on how to hash passwords, the following C# code from the docs relies on the HMACSHA1 algorithm:
public static void Main(string[] args)
{
    Console.Write("Enter a password: ");
    string password = Console.ReadLine();
    // generate a 128-bit salt using a secure PRNG
    byte[] salt = new byte[128 / 8];
    using (var rng = RandomNumberGenerator.Create())
    {
        rng.GetBytes(salt);
    }
    Console.WriteLine($"Salt: {Convert.ToBase64String(salt)}");
    // derive a 256-bit subkey (use HMACSHA1 with 10,000 iterations)
    string hashed = Convert.ToBase64String(KeyDerivation.Pbkdf2(
        password: password,
        salt: salt,
        prf: KeyDerivationPrf.HMACSHA1,
        iterationCount: 10000,
        numBytesRequested: 256 / 8));
    Console.WriteLine($"Hashed: {hashed}");
}
I would like to know if there is a way to determine the length of the password hash based on the length of the password. So if the user has a password with a length of x is there a way to calculate the length of the hash?
I want to know it because in my database the password column currently is a varchar taking 128 characters. The REST API built on top of it should restrict the password length so that the database will never crash because the password is too long and generates a password hash longer than 128 characters.
Or is it best practice to make the database column a varchar of 256 characters and have the API only allow passwords of 30 characters or fewer, so it will never hit the limit?
It would be nice if the answer is independent from the code language, this is more a question in general.
The output of PBKDF2 can be specified. A PBKDF is a password based key derivation function. Generally those have a key expansion phase that allows the output to be specified.
However, if PBKDF2 is used as password hash rather than for key derivation the size of the configured hash is kept; that provides the maximum security that can be retrieved from the algorithm. In this case that's SHA-1 that generates 160 bits / 20 bytes.
Unless you really need text, the output can be stored as static binary of 20 bytes. In your case you should be storing it as the base 64 encoding of the 20 bytes. That amounts to a fixed 28 characters: ((20 + 2) / 3) * 4 = 28 (with integer division) calculates the base 64 expansion. However, your code explicitly specifies the output size to be 256 / 8 = 32 bytes. The same calculation shows that this size always uses 44 base 64 characters.
Producing 32 bytes while using SHA-1 is not a good setting because it requires the inner function of PBKDF2 to run twice, which gives you no advantage over running it once to produce 20 bytes, but does give an advantage to an attacker. An attacker only has to check the first 20 bytes to make sure a password matches, after all, and for that only one of the two runs is required. The method that PBKDF2 uses to expand the key size over the hash size is really inefficient and may be considered a design flaw.
On the other hand, 10,000 iterations is not very high. You should, for PBKDF2:
specify the output size of the underlying hash as output size (20 bytes instead of 32 bytes for SHA-1) and
use a higher number of iterations (limited by how much CPU time you can spend in PBKDF2).
The size of the password doesn't have any influence on the size of the password hash.
Beware that some password hashes on other runtimes create a password hash string themselves, more compatible with crypt on Unix systems. So they would have a larger output that is not directly compatible.
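The size arithmetic in this answer can be checked with a small sketch. The class and method names (HashSizes, Base64Length, Pbkdf2Blocks, Hash) are made up for illustration, and Rfc2898DeriveBytes is used here in place of KeyDerivation.Pbkdf2 only so that the example needs no extra package.

```csharp
using System;
using System.Security.Cryptography;

public static class HashSizes
{
    // Base64 encodes every 3 input bytes as 4 characters and pads the last group.
    public static int Base64Length(int byteCount) => (byteCount + 2) / 3 * 4;

    // PBKDF2 emits output in blocks the size of the underlying hash;
    // every block costs one full run of the iteration count.
    public static int Pbkdf2Blocks(int requestedBytes, int hashBytes) =>
        (requestedBytes + hashBytes - 1) / hashBytes;

    // PBKDF2-HMAC-SHA-1 with 10,000 iterations, Base64 encoded.
    public static string Hash(string password, byte[] salt, int outputBytes)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, 10000, HashAlgorithmName.SHA1))
        {
            return Convert.ToBase64String(kdf.GetBytes(outputBytes));
        }
    }
}
```

Base64Length gives 28 for 20 bytes, 44 for 32 bytes and 88 for 64 bytes, and with SHA-1 (20 byte blocks) Pbkdf2Blocks gives 1, 2 and 4 for those sizes. Calling Hash with a one-character password and with a 500-character password, using the same outputBytes, returns strings of identical length, so the column width depends only on the requested output size.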
I'm trying to use PBKDF2 in C# to create a password, then I'm trying to retrieve that password.
var masterPwd = "masterPassword";
var service = "www.google.com";
byte[] salt = CreateSalt(16);
var encodedPwd = CreateMasterPassword(masterPwd, salt);
var decoded = CreateMasterPassword(encodedPwd, salt);
With the following functions defined:
public static byte[] CreateSalt(int size)
{
    var salt = new byte[size];
    using (var random = new RNGCryptoServiceProvider())
    {
        random.GetNonZeroBytes(salt);
    }
    return salt;
}
public static string CreateMasterPassword(string password, byte[] salt)
{
    string PassHash = Convert.ToBase64String(KeyDerivation.Pbkdf2(
        password: password,
        salt: salt,
        prf: KeyDerivationPrf.HMACSHA256,
        iterationCount: 10000,
        numBytesRequested: 256 / 8));
    return PassHash;
}
In this case, shouldn't decoded be the same as masterPwd?
I think you have a bit of a misunderstanding about what PBKDF2 does. It is not an encryption function where you can ever recover the plaintext data (let's put brute force aside as it is not an 'intended use'). Rather, it is a "slow" hashing mechanism, often described as "one way".
PBKDF2 is a key derivation function, but is also used for storing passwords. Here's a typical flow for PBKDF2 when used for password storage.
A user creates an account with a website with a password. The site generates a random salt, then applies PBKDF2 to the password with the salt, and stores the result and the salt. The salt is stored in plain text.
When the user needs to log in again, the site asks for the username and password. It looks up the salt for that user, then it re-applies PBKDF2 to the password the user entered.
It compares the stored hash with the hash of what the user entered. If the hashes are equal, the site knows they typed the password correctly.
This approach means the site does not store the password in a way that it can possibly know. This allows the site to disavow knowledge of the password.
If that is what you want to do, then that is how you should use it.
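As a rough illustration of that flow, the sketch below derives a hash and later checks a login attempt by deriving again with the stored salt. OneWayDemo, Derive and Matches are hypothetical names, and Rfc2898DeriveBytes with HMAC-SHA-256 stands in for the KeyDerivation.Pbkdf2 call from the question so the example has no package dependency.

```csharp
using System;
using System.Security.Cryptography;

public static class OneWayDemo
{
    // PBKDF2-HMAC-SHA-256, 10,000 iterations, 32 byte output, Base64 encoded.
    public static string Derive(string password, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, 10000, HashAlgorithmName.SHA256))
        {
            return Convert.ToBase64String(kdf.GetBytes(32));
        }
    }

    // Login check: hash what the user typed with the stored salt and compare the hashes.
    public static bool Matches(string enteredPassword, byte[] storedSalt, string storedHash)
    {
        byte[] entered = Convert.FromBase64String(Derive(enteredPassword, storedSalt));
        byte[] stored = Convert.FromBase64String(storedHash);
        return CryptographicOperations.FixedTimeEquals(entered, stored);
    }
}
```

With stored = Derive("masterPassword", salt), calling Derive(stored, salt) simply hashes the hash and does not return "masterPassword", whereas Matches("masterPassword", salt, stored) is true. The password is never recovered; only the two hashes are compared.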
If you do need a way to have a "two way" algorithm, then this goes from hashing to encryption. A symmetric algorithm would be used in this place, with all of the troublesome issues of key and IV management. You would most likely want to take a look at a high abstraction that is built on top of symmetric ciphers like libsodium.
libsodium is a nice abstraction built on top of primitives that takes the guesswork out of how to use them. It offers simple APIs such as "encrypt this thing with this password" and it correctly derives an encryption key from the password, performs some form of authentication on the encryption, and is well regarded by information security experts.
I've been reading about securing users passwords in the database (https://crackstation.net/hashing-security.htm). The basic idea is understood - generate a random Salt, append it to the password and hash the password.
So here's what I did (I didn't put here some methods that do conversion to strings):
RandomNumberGenerator randomNumberGenerator = RandomNumberGenerator.Create();
byte[] rndBytes = new byte[512];
randomNumberGenerator.GetBytes(rndBytes);
string salt = ToHexString(rndBytes);
var sha512Hasher = SHA512.Create();
string hashedPwd = ToHexString(sha512Hasher.ComputeHash(GetBytes(pwd + salt)));
According to the article this is secure but can be made even more secure by using "key stretching", which to my understanding is hashing that is deliberately slowed down (using a parameter) to make brute-forcing the password harder.
So here's what I did:
RandomNumberGenerator randomNumberGenerator = RandomNumberGenerator.Create();
byte[] salt = new byte[512];
randomNumberGenerator.GetBytes(salt);
Rfc2898DeriveBytes k1 = new Rfc2898DeriveBytes(user.Password, salt, 1000);
byte[] hashBytes = k1.GetBytes(512);
string hash = ToHexString(hashBytes);
Now here are my questions:
What is the difference between SHA512 and Rfc2898DeriveBytes? which is more secure?
Should I have smaller salt with more iterations? Will it make it more secure?
On a 1000 iterations it runs very fast - how slow should it be? half a second? a second? What is the rule of thumb here?
On the database - should I convert the byte array to string and store strings or should I store the byte array in a binary data field?
Edit (another question)
If I iterate a 1000 times over rehashing SHA512 - does it give the same security?
What is the difference between SHA512 and Rfc2898DeriveBytes?
SHA512 is a cryptographic hash function, while Rfc2898DeriveBytes is a key-derivation function. As you already wrote, hash functions are too fast and can be brute-forced too easily; that's why we need functions with a cost factor like BCrypt, SCrypt, PBKDF2 or Argon2. As far as I know, Rfc2898DeriveBytes implements PBKDF2 using an HMAC with SHA1. This also answers your other question: an iterated SHA is less secure than Rfc2898DeriveBytes.
Should I have smaller salt with more iterations?
Salt and cost factor are not related and have different purposes. The salt prevents the use of rainbow tables; the iterations are a countermeasure against brute-force attacks. You can find more information in my tutorial about safe password storage. So no, don't make the salt shorter.
how slow should it be?
Of course this depends on your server and your requirements for security, slower means harder to brute-force. A rule of thumb is about 50 milliseconds for a single hash.
On the database - should I convert the byte array to string?
This is up to you. Strings are easier to handle for backups, migration and debugging, while byte arrays need less space in the database. Maybe you should also have a look at BCrypt.Net, it generates strings as output which contain the salt and are easy to store in a single database field [string].
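One way to turn the "about 50 milliseconds" rule of thumb into a concrete iteration count is to measure on the server that will do the hashing. The sketch below is only an assumption about how that could be done: IterationCalibrator and Calibrate are made-up names, and HMAC-SHA-512 with a 64 byte output is chosen for illustration rather than taken from the answer.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

public static class IterationCalibrator
{
    // Doubles the iteration count until a single derivation takes at least targetMs.
    public static int Calibrate(int targetMs, int startIterations = 1000)
    {
        byte[] salt = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }

        int iterations = startIterations;
        while (true)
        {
            Stopwatch watch = Stopwatch.StartNew();
            using (var kdf = new Rfc2898DeriveBytes(
                "benchmark password", salt, iterations, HashAlgorithmName.SHA512))
            {
                kdf.GetBytes(64); // one SHA-512 block, so the cost is one run of the iterations
            }
            watch.Stop();

            // Stop once the target is reached, or before doubling would overflow an int.
            if (watch.ElapsedMilliseconds >= targetMs || iterations > int.MaxValue / 2)
            {
                return iterations;
            }
            iterations *= 2;
        }
    }
}
```

Calling IterationCalibrator.Calibrate(50) on the production machine gives a starting point. The first measurement may include one-time start-up cost, so running the calibration twice and keeping the second result is a reasonable precaution.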
I'm using Rfc2898DeriveBytes to securely generate encryption key and initialization vector from user-supplied string password, to use with symmetric encryption (e.g. AesManaged).
I'm taking the SHA1 hash of password as a salt parameter to Rfc2898DeriveBytes. Is that ok? If not, then where should I get the salt from? I will need the same salt when decrypting, right? So I have to store it somewhere unencrypted - unsecured. If I have to store it securely, then it just becomes another "password", isn't it?
void SecureDeriveKeyAndIvFromPassword(string password, int iterations,
    int keySize, int ivSize, out byte[] key, out byte[] iv)
{
    // Generate the salt from password:
    byte[] salt = (new SHA1Managed()).ComputeHash(Encoding.UTF8.GetBytes(password));
    // Derive key and IV bytes from password:
    Rfc2898DeriveBytes derivedBytes = new Rfc2898DeriveBytes(password, salt, iterations);
    key = derivedBytes.GetBytes(keySize);
    iv = derivedBytes.GetBytes(ivSize);
}
I've seen using the constant (hard-coded) salt, and I've seen people complaining about it. I thought deriving salt from password would be the better idea, but I'm not sure this is an optimal solution.
Shortly, I have a file that needs to be encrypted, and password string input by user. How do I properly use Rfc2898DeriveBytes to derive secure encryption key and IV?
Thanks.
EDIT:
Thanks for your answers. I now understand that the main (maybe only?) purpose of salt is to make generation of rainbow tables impossible - you can't pre-generate the hash of "P#$$w0rd" because it will have a different hash for each possible salt value. I understand this perfectly, BUT... Is this really relevant to symmetric encryption? I'm not storing the hash anywhere right? So even if the attacker has the rainbow table for all possible password combinations, he can't do much, right?
So, my question now is: Is there any advantage of using the random salt in each encryption operation, compared to using password-derived (or even hard-coded) salt, when used with symmetric encryption algorithms (like AesManaged of .NET)?
A salt should be unique for each password, which means you create a random salt for every password you want to hash. The salt is not a secret and can be stored in plain text with your calculated hash value.
The idea of the salt is, that an attacker cannot use a prebuilt rainbowtable, to get the passwords. He would have to build such a rainbowtable for every password separately, and this doesn't make sense. It's easier to brute-force, until you found a match.
There is an example in MSDN where the salt is obtained from the random source of the operating system. This is the best you can do to get a safe salt; do not derive it from your password.
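For the file-encryption case in the question, a random salt can simply be written in front of the ciphertext and read back when decrypting. The following is a minimal sketch under several assumptions: the names PasswordEncryption, Encrypt and Decrypt are hypothetical, the iteration count is a placeholder, HMAC-SHA-512 is chosen so that key and IV fit in a single PBKDF2 block, and AES runs in its default CBC mode without any authentication tag, which a real design would want to add.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordEncryption
{
    private const int SaltSize = 16;
    private const int KeySize = 32;        // AES-256 key
    private const int IvSize = 16;         // AES block size
    private const int Iterations = 100000; // assumption: tune to your hardware

    // Output layout: [16 byte salt][ciphertext]
    public static byte[] Encrypt(byte[] plaintext, string password)
    {
        // Fresh random salt for every encryption; it is not secret.
        byte[] salt = new byte[SaltSize];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }

        using (var kdf = new Rfc2898DeriveBytes(password, salt, Iterations, HashAlgorithmName.SHA512))
        using (Aes aes = Aes.Create())
        {
            aes.Key = kdf.GetBytes(KeySize);
            aes.IV = kdf.GetBytes(IvSize);
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(plaintext, 0, plaintext.Length);
                byte[] output = new byte[SaltSize + cipher.Length];
                Buffer.BlockCopy(salt, 0, output, 0, SaltSize);
                Buffer.BlockCopy(cipher, 0, output, SaltSize, cipher.Length);
                return output;
            }
        }
    }

    public static byte[] Decrypt(byte[] data, string password)
    {
        // Read the salt back from the front of the data and derive the same key and IV.
        byte[] salt = new byte[SaltSize];
        Buffer.BlockCopy(data, 0, salt, 0, SaltSize);

        using (var kdf = new Rfc2898DeriveBytes(password, salt, Iterations, HashAlgorithmName.SHA512))
        using (Aes aes = Aes.Create())
        {
            aes.Key = kdf.GetBytes(KeySize);
            aes.IV = kdf.GetBytes(IvSize);
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                return decryptor.TransformFinalBlock(data, SaltSize, data.Length - SaltSize);
            }
        }
    }
}
```

Since the salt differs on every call, encrypting the same file twice with the same password yields different keys, IVs and ciphertexts, and nothing besides the password has to be kept secret.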
A salt is designed to protect against multi-target attacks by making each target behave differently. Rainbow tables are just one particular incarnation of multi-target attacks, where the computational effort is expended before you obtain the targets.
There are situations where multi-target attacks are applicable, but rainbow tables are not.
One example of this: Assume you're using an authenticated encryption scheme with semantic security, such as AES-GCM with unique nonces. Now you've obtained a million different messages encrypted using different passwords.
If you use no salt, to check if a password applies to any one of these, the attacker needs one KDF operation, and one million decryption operations. If you use a salt, the attacker needs one million KDF operations and one million decryption operations. Since the KDF is slow compared to the decryption, an attack against the first scheme is much faster than an attack on the second scheme.
I don't really know what Rfc2898DeriveBytes is, but I can tell you the following: the salt doesn't have to be secured. Now, you said you have seen people complaining about hard-coded, constant values for salt, and whoever said that is right. Salt should be a random value, never a constant one, otherwise its purpose is defeated.
Do you understand what salt is used for? You clearly don't. Using the hash as salt is a bad idea because password X will always be salted with the same value Y, again, defeating its purpose.
People dislike hard-coded salts as they are accessible to all developers of the project (and possibly the public in the case of open source projects or reverse engineering). Attackers can then compute rainbow tables for your particular salt and start attacking your system.
A good choice of salt value is something that:
Is available each time you check the password
Doesn't change between password checks
Differs for each (or most) password calculations
A username would then be a decent choice, provided it cannot change. Or, generate a completely random value when you first create the user and store that as the salt, along with the user data in your database.
Others already explained the purpose of the salt, and why it can be public information.
One additional part of the answer to your question: do not derive the salt from the password itself. That would be very similar to the programming blunder that ended up exposing millions of passwords after the Ashley Madison hack.
The problem is that when you use the password to generate the salt, you are essentially making the password available in a second, and much-easier-to-crack, form. The attacker can completely ignore the output of the PBKDF2, and simply reverse the salt itself. That is only protected with SHA1, which is already broken.
At Ashley Madison, the error was similar. The passwords were stored in the main database using bcrypt, and thought to be secure. But then somebody discovered that the passwords for many accounts were actually stored twice, and the second copy was only protected with MD5.
I want to store a login and password in an SQLite database. The database is encrypted using the SQLCipher library, but the password used to encrypt the database is a separate issue; that password is stored in the application's code. The login and password are provided by the user to log in to the application. C# has the SHA256 class. Is using this class enough, or should I use a salted hash or some other method?
Thanks
To store a user password in a database for login matters, you should use a hash function with a salt.
SHA-256 is one such hash function, but better options exist. I recommend using the PBKDF2 key derivation function. You can implement your own PBKDF2 hashing method using the Rfc2898DeriveBytes class provided in the .NET Framework.
Here is a quick how-to-do-it:
int saltSize = 256;    // Number of bytes of the salt
int iterations = 1000; // Number of times we iterate the function.
                       // The more we iterate, the longer it takes.
                       // The advantage of a high iteration count is to
                       // make brute-force attacks more painful.
int hashSize = 20;     // Number of bytes of the hash (the output)
var deriveBytes = new Rfc2898DeriveBytes("mypassword", saltSize, iterations);
byte[] salt = deriveBytes.Salt;
byte[] hash = deriveBytes.GetBytes(hashSize);
Now you just have to store the salt and the hash in your database. Use Convert.ToBase64String to get a string from a byte[] and Convert.FromBase64String to convert it back.
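If a single text column is preferred, the iteration count, salt and hash can be packed into one string. The layout "iterations:salt:hash" and the names PasswordStorage, HashToString and VerifyFromString below are assumptions made for illustration, not a standard format; the colon is safe as a separator because it never appears in Base64 output.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;

public static class PasswordStorage
{
    // Returns "iterations:saltBase64:hashBase64" for storage in one text column.
    public static string HashToString(string password, int iterations)
    {
        // This constructor generates a random 16 byte salt internally.
        using (var kdf = new Rfc2898DeriveBytes(password, 16, iterations, HashAlgorithmName.SHA1))
        {
            return iterations.ToString(CultureInfo.InvariantCulture)
                + ":" + Convert.ToBase64String(kdf.Salt)
                + ":" + Convert.ToBase64String(kdf.GetBytes(20));
        }
    }

    public static bool VerifyFromString(string password, string stored)
    {
        string[] parts = stored.Split(':');
        if (parts.Length != 3)
        {
            return false;
        }
        int iterations = int.Parse(parts[0], CultureInfo.InvariantCulture);
        byte[] salt = Convert.FromBase64String(parts[1]);
        byte[] expected = Convert.FromBase64String(parts[2]);

        // Re-derive with the stored parameters and compare in constant time.
        using (var kdf = new Rfc2898DeriveBytes(password, salt, iterations, HashAlgorithmName.SHA1))
        {
            return CryptographicOperations.FixedTimeEquals(kdf.GetBytes(expected.Length), expected);
        }
    }
}
```

A row created with HashToString("mypassword", 1000) can later be checked with VerifyFromString("mypassword", row) without any other stored state.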
Another alternative is to use bcrypt. See this interesting article.