RNGCryptoServiceProvider and Zeros?

Walking through some cryptography stuff, I saw that RNGCryptoServiceProvider has 2 methods:
RNGCryptoServiceProvider.GetNonZeroBytes
and
RNGCryptoServiceProvider.GetBytes
And so I ask:
What is odd about filling an array of bytes with a cryptographically strong sequence of random values, some (0 or more) of which are zeros?
(The values are random, so apparently there won't be many zeros, and zero is still a regular number.)
Why did they create the distinction?

Within the .NET Framework, GetNonZeroBytes(byte[]) is used when generating PKCS#1 padding for RSA encryption, which uses 0x00 as a separator.
Using a tool like Reflector, you can see it used in RSAPKCS1KeyExchangeFormatter.CreateKeyExchange(byte[]) to implement padding as per RFC 2313, section 8.1.2 (RFC 3218 has some nice ASCII art that demonstrates the byte layout more clearly).
GetNonZeroBytes(byte[]) could also be used to generate salt. The Cryptography StackExchange site has a similar question which suggests that avoiding 0x00 is to help with libraries and APIs that may treat the salt as a zero-terminated string, which would accidentally truncate the salt. However, unless one is using P/Invoke, this is unlikely to be a concern in .NET.
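To make the separator point concrete, here is a minimal sketch of the padding layout (0x00, 0x02, random non-zero padding, 0x00, message). The function names PadPkcs1Type2 and UnpadPkcs1Type2 are made up for this illustration and are not the framework's internal code. The unpadding shown is not constant-time, so treat it as a layout demo only.

```csharp
using System;
using System.Security.Cryptography;

// Builds the block 0x00 || 0x02 || PS || 0x00 || M.
// PS must not contain 0x00, because the first zero byte after the
// two header bytes marks where the message begins.
byte[] PadPkcs1Type2(byte[] message, int blockLength)
{
    int paddingLength = blockLength - message.Length - 3;
    if (paddingLength < 8)
        throw new ArgumentException("Message too long for this block length.");

    byte[] padding = new byte[paddingLength];
    using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
    {
        rng.GetNonZeroBytes(padding); // random bytes, none of them 0x00
    }

    byte[] block = new byte[blockLength];
    block[0] = 0x00;
    block[1] = 0x02;
    Buffer.BlockCopy(padding, 0, block, 2, paddingLength);
    block[2 + paddingLength] = 0x00; // separator
    Buffer.BlockCopy(message, 0, block, 3 + paddingLength, message.Length);
    return block;
}

// Recovers M by scanning for the first 0x00 after the header.
byte[] UnpadPkcs1Type2(byte[] block)
{
    int separator = Array.IndexOf(block, (byte)0x00, 2);
    if (block[0] != 0x00 || block[1] != 0x02 || separator < 10)
        throw new CryptographicException("Invalid padding.");

    byte[] message = new byte[block.Length - separator - 1];
    Buffer.BlockCopy(block, separator + 1, message, 0, message.Length);
    return message;
}

Console.WriteLine(BitConverter.ToString(PadPkcs1Type2(new byte[] { 0x41, 0x42 }, 16)));
```

If the padding came from GetBytes instead, a stray 0x00 inside it would be mistaken for the separator and the message would be recovered incorrectly.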

Related

How to obfuscate an integer?

From a list of integers in C#, I need to generate a list of unique values. I thought of MD5 or similar, but they generate too many bytes.
Integer size is 2 bytes.
I want to get a one way correspondence, for example
0 -> ARY812Q3
1 -> S6321Q66
2 -> 13TZ79K2
So, given the hash, the user cannot determine the integer or infer a sequence behind a list of hashes.
For now, I tried using MD5(my number) and keeping the first 8 characters. However, I found the first collision at 51389. Which other alternatives could I use?
As I said, I only need one way. It is not necessary to be able to calculate the integer from the hash. The system uses a dictionary to find them.
UPDATE:
Replying to some suggestions about using GetHashCode(): GetHashCode returns the same integer. My purpose is to hide the integer from the end user. In this case, the integer is the primary key of a database table. I do not want to give this information to users because they could deduce the number of records in the database or the growth in records per week.
Hashes are not unique, so maybe I need to use encryption like TripleDES or similar, but I wanted to use something fast and simple. Also, TripleDES returns too many bytes.
UPDATE 2:
I was talking about hashes, and that was a mistake. In reality, I am trying to obfuscate the integer. I tried a hash algorithm, which is not a good idea because hashes are not unique.

Update May 2017
Feel free to use (or modify) the library I developed, installable via Nuget with:
Install-Package Kent.Cryptography.Obfuscation
This converts a non-negative id such as 127 to an 8-character string, e.g. xVrAndNb, and back (with some available options to randomize the sequence each time it's generated).
Example Usage
var obfuscator = new Obfuscator();
string maskedID = obfuscator.Obfuscate(15);
Full documentation at: Github.
Old Answer
I came across this problem a while back and couldn't find what I wanted on Stack Overflow. So I made this obfuscation class and shared it on GitHub.
Obfuscation.cs - Github
You can use it by:
Obfuscation obfuscation = new Obfuscation();
string maskedValue = obfuscation.Obfuscate(5);
int? value = obfuscation.DeObfuscate(maskedValue);
Perhaps it can be of help to future visitors :)

Encrypt it with Skip32, which produces a 32 bit output. I found this C# implementation but can't vouch for its correctness. Skip32 is a relatively uncommon crypto choice and probably hasn't been analyzed much. Still it should be sufficient for your obfuscation purposes.
The strong choice would be format preserving encryption using AES in FFX mode. But that's pretty complicated and probably overkill for your application.
When encoded with Base32 (case insensitive, alphanumeric) a 32 bit value corresponds to 7 characters. When encoded in hex, it corresponds to 8 characters.
There is also the non cryptographic alternative of generating a random value, storing it in the database and handling collisions.
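The size claim for Base32 can be checked with a short sketch: 32 bits at 5 bits per character need ceil(32 / 5) = 7 characters. The helpers ToBase32 and FromBase32 below are illustrative names, the alphabet is the RFC 4648 one, and the input is assumed to be the 32-bit output of whatever cipher is used (Skip32 itself is not implemented here).

```csharp
using System;

// Encodes a 32-bit value as exactly 7 Base32 characters (5 bits each).
string ToBase32(uint value)
{
    const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    char[] chars = new char[7];
    for (int i = 6; i >= 0; i--)
    {
        chars[i] = alphabet[(int)(value & 31)]; // lowest 5 bits
        value >>= 5;
    }
    return new string(chars);
}

// Decodes 7 Base32 characters back to the 32-bit value (case-insensitive).
uint FromBase32(string text)
{
    const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    if (text.Length != 7)
        throw new FormatException("Expected exactly 7 characters.");

    ulong value = 0;
    foreach (char c in text.ToUpperInvariant())
    {
        int digit = alphabet.IndexOf(c);
        if (digit < 0)
            throw new FormatException("Invalid Base32 character.");
        value = (value << 5) | (uint)digit;
    }
    if (value > uint.MaxValue)
        throw new FormatException("Value does not fit in 32 bits.");
    return (uint)value;
}

Console.WriteLine(ToBase32(0));             // AAAAAAA
Console.WriteLine(ToBase32(uint.MaxValue)); // D777777
```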

XOR the integer, perhaps with a random key that is generated per user (stored in the session). While it's not strictly a hash (as it is reversible), the advantages are that you don't need to store it anywhere, and the size stays the same.
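A minimal sketch of the XOR idea follows; XorMask and the key constant are hypothetical names and values chosen for the example. It also shows one limitation worth knowing: with a fixed key, the XOR of two masked values equals the XOR of the two original ids, so relationships between ids remain visible.

```csharp
using System;

// Masks or unmasks an id; applying it twice with the same key restores the id.
int XorMask(int id, int key) => id ^ key;

// 0x5A3C19E7 is an arbitrary example key, not a recommended value.
Console.WriteLine(XorMask(XorMask(5, 0x5A3C19E7), 0x5A3C19E7));         // 5
Console.WriteLine(XorMask(1, 0x5A3C19E7) ^ XorMask(2, 0x5A3C19E7));     // 3, same as 1 ^ 2
```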

For what you want, I'd recommend using GUIDs (or other kind of unique identifier where the probability of collision is either minimal or none) and storing them in the database row, then just never show the ID to the user.
IMHO, it's kind of bad practice to ever show the primary key in the database to the user (much less to let users do any kind of operations on them).
If they need to have raw access to the database for some reason, then just don't use ints as primary keys, and make them guids (but then your requirement loses importance since they can just access the number of records)
Edit
Based on your requirements, if you don't mind that the algorithm is potentially computationally expensive, then you can just generate a random 8-byte string every time a new row is added, and keep generating random strings until you find one that is not already in the database.
This is far from optimal, and -can- be computationally expensive, but given that you use a 16-bit id and the maximum number of rows is 65536, I'd not care too much about it (the probability of an 8-byte random string already being in a list of 65536 entries is minimal, so you'll probably succeed on the first or at most the second try, if your pseudo-random generator is good).
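The retry loop could look roughly like this. It is a sketch under assumptions: the HashSet stands in for a unique column in the database, the codes are 8 characters from a 32-symbol alphabet (to match the 8-character format in the question), and the names NewRandomCode and AllocateUniqueCode are made up here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Generates a random code; 32 symbols means (byte & 31) has no modulo bias.
string NewRandomCode(RandomNumberGenerator rng, int length)
{
    const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    byte[] bytes = new byte[length];
    rng.GetBytes(bytes);

    char[] chars = new char[length];
    for (int i = 0; i < length; i++)
        chars[i] = alphabet[bytes[i] & 31];
    return new string(chars);
}

// Keeps drawing codes until one is not yet taken.
// 'existing' stands in for a unique index on the database column.
string AllocateUniqueCode(ISet<string> existing, RandomNumberGenerator rng)
{
    while (true)
    {
        string code = NewRandomCode(rng, 8);
        if (existing.Add(code)) // Add returns false when the code already exists
            return code;
    }
}
```

With 32^8 = 2^40 possible codes and at most 65536 rows, a single draw collides with probability at most 2^16 / 2^40 = 2^-24, which is why a retry is rarely needed.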

Format-preserving Encryption sample

I want to encrypt/decrypt digits into string (with only digits and/or upper characters) with the same length using Format-preserving Encryption. But I don't find implementation steps. So, can anyone please provide WORKING sample for C# 2.0?
For an example,
If I encrypt a fixed-length plaintext like 99991232 (with or without a fixed key), then the ciphertext should be like 23220978 or ED0FTS. If the encrypted string is shorter than the plaintext, that is also all right. But the ciphertext must not be longer than the plaintext, and the ciphertext must be of fixed length.

From your question I assume that the plain text is numeric, where the cipher text could be alphanumeric. Due to this it is quite easy to make an encoding scheme. This makes your format preservation less stringent and this can be taken advantage of (this won't work if your plain text is also alphanumeric).
First, find a power of 2 that is greater than the number of discrete values that you have, for example, in the numeric case you have 10 discrete values - so you would use 16 (2 ^ 4). Create a 'BaseX' encoding scheme for this (in this case Base16) and decode the plain text to binary using it.
Thus given the plain text:
1, 2, 3, 4
We encode it to:
0001-0010 0011-0100
You can then run this through your length-preserving cipher (one example of a length-preserving cipher is AES in counter mode). Say you get the following value back:
1001-1100 1011-1100
Encode this using your 'BaseX' encoder, and in our case we would get:
9, C, B, C
Which is the same length. I threw together a sample for you (bit large to paste here).
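Since the linked sample is not included here, the following is a rough sketch of the same idea, not the answerer's code. It treats each decimal digit as a 4-bit value, XORs it with an AES counter-mode keystream (built by hand from AES-ECB), and prints the result as hex digits. CtrKeystream and XorNibbles are hypothetical names; the nonce is assumed to be at most 12 bytes. Reusing the same key and nonce for different inputs leaks information, so this only illustrates the length-preserving encoding.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Produces keystream bytes by AES-encrypting successive counter blocks.
byte[] CtrKeystream(byte[] key, byte[] nonce, int length)
{
    if (nonce.Length > 12)
        throw new ArgumentException("Nonce must be at most 12 bytes.");

    using (Aes aes = Aes.Create())
    {
        aes.Key = key;
        aes.Mode = CipherMode.ECB;
        aes.Padding = PaddingMode.None;
        using (ICryptoTransform encryptor = aes.CreateEncryptor())
        {
            byte[] stream = new byte[length];
            byte[] counter = new byte[16];
            Buffer.BlockCopy(nonce, 0, counter, 0, nonce.Length);
            for (int offset = 0, block = 0; offset < length; offset += 16, block++)
            {
                counter[14] = (byte)(block >> 8);
                counter[15] = (byte)block;
                byte[] encrypted = encryptor.TransformFinalBlock(counter, 0, 16);
                Buffer.BlockCopy(encrypted, 0, stream, offset, Math.Min(16, length - offset));
            }
            return stream;
        }
    }
}

// XORs each hex digit (4 bits) with 4 bits of keystream.
// The same call both encrypts and decrypts.
string XorNibbles(string hexDigits, byte[] key, byte[] nonce)
{
    byte[] stream = CtrKeystream(key, nonce, (hexDigits.Length + 1) / 2);
    StringBuilder result = new StringBuilder(hexDigits.Length);
    for (int i = 0; i < hexDigits.Length; i++)
    {
        int nibble = Convert.ToInt32(hexDigits[i].ToString(), 16);
        int keyBits = (i % 2 == 0) ? stream[i / 2] >> 4 : stream[i / 2] & 0x0F;
        result.Append((nibble ^ keyBits).ToString("X"));
    }
    return result.ToString();
}
```

Calling XorNibbles("99991232", key, nonce) yields 8 characters from 0-9 and A-F, and calling it again on that output with the same key and nonce returns the original digits.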

As Henk said, "Format Preserving Encryption" is not defined. I can think of two possible answers:
Use AES and convert the cyphertext byte array to a hex string or to Base64.
Use a simple Vigenère cipher just replacing the characters you want to replace.
You need to specify your requirement more clearly.
ETA: You do not say how secure you need this to be. Standard Vigenère is not secure against any sort of strong attack, but will be safe from casual users. Vigenère can be made absolutely secure, but that requires as much true random key material as there is plaintext to encypher, and is usually impractical.

Can someone explain how BCrypt verifies a hash?

I'm using C# and BCrypt.Net to hash my passwords.
For example:
string salt = BCrypt.Net.BCrypt.GenerateSalt(6);
var hashedPassword = BCrypt.Net.BCrypt.HashPassword("password", salt);
//This evaluates to True. How? I'm not telling it the salt anywhere, nor
//is it a member of a BCrypt instance because there IS NO BCRYPT INSTANCE.
Console.WriteLine(BCrypt.Net.BCrypt.Verify("password", hashedPassword));
Console.WriteLine(hashedPassword);
How is BCrypt verifying the password with the hash if it's not saving the salt anywhere. The only idea I have is that it's somehow appending the salt at the end of the hash.
Is this a correct assumption?

A BCrypt hash string looks like:
$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm
\__/\/ \____________________/\_____________________________/
 |  |           Salt                      Hash
 |  Cost
 Version
Where
2a: Algorithm Identifier (BCrypt, UTF8 encoded password, null terminated)
10: Cost Factor (2^10 = 1,024 rounds)
Ro0CUfOqk6cXEKf3dyaM7O: OpenBSD-Base64 encoded salt (22 characters, 16 bytes)
hSCvnwM9s4wIX9JeLapehKK5YdLxKcm: OpenBSD-Base64 encoded hash (31 characters, 24 bytes)
Edit: I just noticed these words fit exactly. I had to share:
$2a$10$TwentytwocharactersaltThirtyonecharacterspasswordhash
$==$==$======================-------------------------------
BCrypt does create a 24-byte binary hash, using 16-byte salt. You're free to store the binary hash and the salt however you like; nothing says you have to base-64 encode it into a string.
But BCrypt was created by guys who were working on OpenBSD. OpenBSD already defines a format for their password file:
$[HashAlgorithmIdentifier]$[AlgorithmSpecificData]
This means that the "bcrypt specification" is inextricably linked to the OpenBSD password file format. And whenever anyone creates a "bcrypt hash" they always convert it to an ISO-8859-1 string of the format:
$2a$[Cost]$[Base64Salt][Base64Hash]
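Because the salt and cost sit at fixed positions in that string, a verifier can simply read them back out. Here is a small sketch that splits the fields; ParseBCryptHash is a name invented for this example and is not part of BCrypt.Net.

```csharp
using System;

// Splits "$<version>$<cost>$<22-char salt><31-char hash>" into its fields.
(string Version, int Cost, string Salt, string Hash) ParseBCryptHash(string hash)
{
    string[] parts = hash.Split('$'); // ["", version, cost, salt + hash]
    if (parts.Length != 4 || parts[0].Length != 0 || parts[3].Length != 53)
        throw new FormatException("Not a bcrypt hash string.");

    int cost = int.Parse(parts[2]);
    return (parts[1], cost, parts[3].Substring(0, 22), parts[3].Substring(22));
}

Console.WriteLine(ParseBCryptHash("$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm").Cost); // 10
```

The number of key-setup rounds is then 1 << Cost, which is 1,024 for a cost of 10.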
A few important points:
2a is the algorithm identifier
1: MD5
2: early bcrypt, which had confusion over which encoding passwords are in (obsolete)
2a: current bcrypt, which specifies passwords as UTF-8 encoded
Cost is a cost factor used when computing the hash. The "current" value is 10, meaning the internal key setup goes through 1,024 rounds
10: 2^10 = 1,024 iterations
11: 2^11 = 2,048 iterations
12: 2^12 = 4,096 iterations
the base64 algorithm used by the OpenBSD password file is not the same Base64 encoding that everybody else uses; they have their own:
Regular Base64 Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
BSD Base64 Alphabet: ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
So any implementations of bcrypt cannot use any built-in, or standard, base64 library
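A decoder for this alphabet is short to write by hand. The sketch below (DecodeBsdBase64 is an illustrative name) assumes the usual most-significant-bit-first packing with no '=' padding, and simply discards leftover bits at the end.

```csharp
using System;
using System.Collections.Generic;

// Decodes text written in the bcrypt/OpenBSD Base64 alphabet.
byte[] DecodeBsdBase64(string text)
{
    const string alphabet =
        "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    List<byte> output = new List<byte>();
    int buffer = 0; // holds bits not yet emitted
    int bits = 0;   // number of valid bits in buffer

    foreach (char c in text)
    {
        int value = alphabet.IndexOf(c);
        if (value < 0)
            throw new FormatException("Invalid character: " + c);

        buffer = (buffer << 6) | value;
        bits += 6;
        if (bits >= 8)
        {
            bits -= 8;
            output.Add((byte)(buffer >> bits)); // emit the top 8 bits
            buffer &= (1 << bits) - 1;          // keep only the remainder
        }
    }
    return output.ToArray(); // fewer than 8 leftover bits are dropped
}

Console.WriteLine(DecodeBsdBase64("Ro0CUfOqk6cXEKf3dyaM7O").Length); // 16
```

The 22-character salt carries 132 bits, which gives the 16 salt bytes. The 31-character hash field carries 186 bits, so decoding it this way yields 23 whole bytes.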
Armed with this knowledge, you can now verify a password correctbatteryhorsestapler against the saved hash:
$2a$12$mACnM5lzNigHMaf7O1py1O3vlf6.BA8k8x3IoJ.Tq3IB/2e7g61Km
BCrypt variants
There is a lot of confusion around the bcrypt versions.
$2$
BCrypt was designed by the OpenBSD people. It was designed to hash passwords for storage in the OpenBSD password file. Hashed passwords are stored with a prefix to identify the algorithm used. BCrypt got the prefix $2$.
This was in contrast to the other algorithm prefixes:
$1$: MD5
$5$: SHA-256
$6$: SHA-512
$2a$
The original BCrypt specification did not define how to handle non-ASCII characters, or how to handle a null terminator. The specification was revised to specify that when hashing strings:
the string must be UTF-8 encoded
the null terminator must be included
$2x$, $2y$ (June 2011)
A bug was discovered in crypt_blowfish, a PHP implementation of BCrypt. It was mis-handling characters with the 8th bit set.
They suggested that system administrators update their existing password database, replacing $2a$ with $2x$, to indicate that those hashes are bad (and need to use the old broken algorithm). They also suggested the idea of having crypt_blowfish emit $2y$ for hashes generated by the fixed algorithm. Nobody else, including canonical OpenBSD, adopted the idea of 2x/2y. This version marker was limited to crypt_blowfish.
The versions $2x$ and $2y$ are not "better" or "stronger" than $2a$. They are remnants of one particular buggy implementation of BCrypt.
$2b$ (February 2014)
A bug was discovered in the OpenBSD implementation of BCrypt. They wrote their implementation in a language that doesn't support strings, so they were faking it with a length prefix, a pointer to a character, and then indexing that pointer with []. Unfortunately they were storing the length of their strings in an unsigned char. If a password was longer than 255 characters, it would overflow and wrap at 255. BCrypt was created for OpenBSD. When they have a bug in their library, they decide it's OK to bump the version. This means that everyone else needs to follow suit if they want to remain current with "their" specification.
http://undeadly.org/cgi?action=article&sid=20140224132743
http://marc.info/?l=openbsd-misc&m=139320023202696
There is no difference between 2a, 2x, 2y, and 2b. If you wrote your implementation correctly, they all output the same result.
If you were doing the right thing from the beginning (storing strings in utf8 and also hashing the null terminator) then: there is no difference between 2, 2a, 2x, 2y, and 2b. If you wrote your implementation correctly, they all output the same result.
The version $2b$ is not "better" or "stronger" than $2a$. It is a remnant of one particular buggy implementation of BCrypt. But since BCrypt canonically belongs to OpenBSD, they get to change the version marker to whatever they want.
The versions $2x$ and $2y$ are not better, or even preferable, to anything. They are remnants of a buggy implementation and should be summarily forgotten.
The only people who need to care about 2x and 2y are those who may have been using crypt_blowfish back in 2011. And the only people who need to care about 2b are those who may have been running OpenBSD.
All other correct implementations are identical and correct.

How is BCrypt verifying the password with the hash if it's not saving the salt anywhere?
Clearly it is not doing any such thing. The salt has to be saved somewhere.
Let's look up password encryption schemes on Wikipedia. From http://en.wikipedia.org/wiki/Crypt_(Unix) :
The output of the function is not merely the hash: it is a text string which also encodes the salt and identifies the hash algorithm used.
Alternatively, an answer to your previous question on this subject included a link to the source code. The relevant section of the source code is:
StringBuilder rs = new StringBuilder();
rs.Append("$2");
if (minor >= 'a') {
rs.Append(minor);
}
rs.Append('$');
if (rounds < 10) {
rs.Append('0');
}
rs.Append(rounds);
rs.Append('$');
rs.Append(EncodeBase64(saltBytes, saltBytes.Length));
rs.Append(EncodeBase64(hashed,(bf_crypt_ciphertext.Length * 4) - 1));
return rs.ToString();
Clearly the returned string is version information, followed by the number of rounds used, followed by the salt encoded as base64, followed by the hash encoded as base64.
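Given that layout, verification amounts to re-hashing the candidate password with the settings embedded in the stored string and comparing the results. The sketch below shows that flow only; VerifyWithEmbeddedSalt is a made-up name, and the hashPassword delegate is a stand-in for a real bcrypt function such as the two-argument HashPassword call used in the question.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Re-hashes the password with the salt and cost taken from the stored hash,
// then compares the two strings in constant time.
bool VerifyWithEmbeddedSalt(
    string password,
    string storedHash,
    Func<string, string, string> hashPassword) // (password, settings) -> full hash string
{
    // "$2a$10$" is 7 characters and the salt is 22, so the settings are the first 29.
    string settings = storedHash.Substring(0, 29);
    string recomputed = hashPassword(password, settings);

    return CryptographicOperations.FixedTimeEquals(
        Encoding.ASCII.GetBytes(recomputed),
        Encoding.ASCII.GetBytes(storedHash));
}
```

This is why no separate salt argument or BCrypt instance is needed: everything required to repeat the computation travels inside the hash string.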

Making smaller a string, c#

I need a library/tool/function that compresses a 50-60 char long string to smaller.
Do you know any?

Effective compression on that scale will be difficult. You might consider Huffman coding. This might give you smaller compression than gzip (since it will result in binary codes instead of a base-85 sequence).

Are you perhaps thinking of a cryptographic hash? For example, SHA-1 (http://en.wikipedia.org/wiki/SHA-1) can be used on an input string to produce a 20-byte digest. Of course, the digest will always be 20 bytes - even if the input string is shorter than 20 bytes.

The framework includes the GZipStream and DeflateStream classes. But that might not really be what you are after - what input strings have to be compressed? ASCII only? Letters only? Alphanumerical string? Full Unicode? And what are allowed output strings?
From an algorithmic stand point and without any further knowledge of the space of possible inputs I suggest to use arithmetic coding. This might shrink the compressed size by a few additional bits compared to Huffman coding because it is not restricted to an integral number of bits per symbol - something that can turn out important when dealing with such small inputs.

If your string only contains lowercase letters a-z and digits 0-9, you could encode each character in 7 bits (plain ASCII).
This will compress a 60-char string to 53 bytes. That alphabet has only 36 symbols, so 6 bits per character are enough, bringing it down to 45 bytes; if you don't need digits, 5 bits per character suffice (38 bytes).
So choosing the right compression method depends on what data your string contains.
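As a rough sketch of the fixed-width packing idea, the code below stores each character of an a-z/0-9 string in 6 bits, so 60 characters take 60 * 6 / 8 = 45 bytes. Pack6Bit and Unpack6Bit are hypothetical helper names; the character count has to be stored or known separately in order to unpack.

```csharp
using System;

// Packs each character into 6 bits, most significant bit first.
byte[] Pack6Bit(string text)
{
    const string alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"; // 36 symbols, fits in 6 bits
    byte[] output = new byte[(text.Length * 6 + 7) / 8];
    int bitPos = 0;
    foreach (char c in text)
    {
        int value = alphabet.IndexOf(c);
        if (value < 0)
            throw new ArgumentException("Unsupported character: " + c);

        for (int bit = 5; bit >= 0; bit--, bitPos++)
        {
            if (((value >> bit) & 1) != 0)
                output[bitPos / 8] |= (byte)(0x80 >> (bitPos % 8));
        }
    }
    return output;
}

// Reverses Pack6Bit; charCount must be supplied by the caller.
string Unpack6Bit(byte[] data, int charCount)
{
    const string alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
    char[] chars = new char[charCount];
    int bitPos = 0;
    for (int i = 0; i < charCount; i++)
    {
        int value = 0;
        for (int bit = 0; bit < 6; bit++, bitPos++)
        {
            int b = (data[bitPos / 8] >> (7 - bitPos % 8)) & 1;
            value = (value << 1) | b;
        }
        chars[i] = alphabet[value];
    }
    return new string(chars);
}

Console.WriteLine(Pack6Bit(new string('a', 60)).Length); // 45
```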

You could simply gzip it
http://www.example-code.com/csharp/gzip_compressString.asp
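For reference, a gzip round trip with the framework's own GZipStream could be sketched as below (GzipCompress and GzipDecompress are names chosen for this example). Keep in mind that the gzip header and trailer alone take at least 18 bytes, so for a 50-60 character string the output is often no smaller than the input.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Compresses a string's UTF-8 bytes with gzip.
byte[] GzipCompress(string text)
{
    byte[] raw = Encoding.UTF8.GetBytes(text);
    using (MemoryStream output = new MemoryStream())
    {
        using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
        {
            gzip.Write(raw, 0, raw.Length);
        } // disposing the GZipStream flushes the final block and trailer
        return output.ToArray();
    }
}

// Decompresses gzip data back into a string.
string GzipDecompress(byte[] data)
{
    using (MemoryStream input = new MemoryStream(data))
    using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
    using (StreamReader reader = new StreamReader(gzip, Encoding.UTF8))
    {
        return reader.ReadToEnd();
    }
}

string sample = "a 50-60 character string that we would like to shrink";
Console.WriteLine(sample.Length + " chars -> " + GzipCompress(sample).Length + " bytes");
```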

I would use something basic like RLE or shared-dictionary-based compression, followed by a block cipher that keeps the size constant.
Maybe smaz is also interesting for you.
Examples of basic compression algorithms:
RLE
(Modified or not) Huffman coding
Burrows-Wheeler transformation
Examples of block ciphers ("bit twiddlers"):
AES
Blowfish
DES
Triple DES
Serpent
Twofish
You will be able to find out what fulfills your needs using Wikipedia.

Why does an encrypted byte array have a different length from its char[] representation?

I was working on some encryption/decryption algorithms and I noticed that the encrypted byte[] arrays always had a length of 33, and the char[] arrays always had a length of 44. Does anyone know why this is?
(I'm using Rijndael encryption.)

Padding and text encoding. Most encryption algorithms have a block size, and input needs to be padded up to a multiple of that block size. Also, turning binary data into text usually involves the Base64 algorithm, which expands 3 bytes into 4 characters.
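The Base64 part of this can be checked directly: every 3 bytes (rounded up) become 4 characters, so 33 bytes become 44. The two helper functions below are illustrative names, and the padded-length formula assumes PKCS#7-style padding, which always adds at least one byte.

```csharp
using System;

// Length of the Base64 text for a given number of bytes (with '=' padding).
int Base64Length(int byteCount) => 4 * ((byteCount + 2) / 3);

// Length after PKCS#7 padding to a whole number of blocks.
int Pkcs7PaddedLength(int plaintextLength, int blockSize) =>
    (plaintextLength / blockSize + 1) * blockSize;

Console.WriteLine(Base64Length(33));                            // 44
Console.WriteLine(Convert.ToBase64String(new byte[33]).Length); // 44
Console.WriteLine(Pkcs7PaddedLength(20, 16));                   // 32
```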

That's certainly not true for all encryption algorithms, it must just be a property of the particular one you're using. Without knowing what algorithm it is, I can only guess, but the ratio 33/44 suggests that the algorithm might be compressing each character into 6 bits in the output byte array. That probably means it's making the assumption that no more than 64 distinct characters are used, which is a good assumption for plain text (in fact, that's how base64 decoding works).
But again, without knowing what algorithm you're using, this is all guesswork.

Without knowing the encryption you're using, it's a little tough to determine the exact cause. To start, here's an article on How to Calculate the Size of Encrypted Data. It sounds like you might be using a hash of your plaintext, which is why the result is shorter.
Edit: Here's the source for a Rijndael implementation. It looks like the ciphertext output is initially the same length as the plaintext input, and then they do a base64 on it, which, as the previous poster mentioned, would reduce your final output to 3/4 of the original input.

No, no idea at all, but my first thought would be that your encryption algorithm is built such that it removes 1 bit per 10 from the output data.
Only you can know for sure, since we cannot see your code from out here :-)

It would be a pretty lousy encryption algorithm if it was just replacing bytes one-for-one. That was state of the art 50 years ago, and it didn't work very well even then. :)
