I am trying to generate an unlock key in a form like XXXX-XXXX-XXXX, or simply a short string or hex string. I am using the RSA algorithm to encrypt and decrypt the key. I get a long string like
Q65g2+uiytyEUW5SFsiI/c5z9NSxyuU2CM1SEly6cAVv9PdTpH81XaWS8lITcaTZ4IjdmINwhHBosvt5kdg==
when I convert the byte array (64 bytes) using the conversion method below.
Convert.ToBase64String(bytes);
My requirement is to generate a key of minimal length. Is there any way to convert the 64-byte array to a shorter string and then convert it back to the original byte array? Any other suggestions for minimizing the string length would be helpful.
I have also tried converting the output to a hexadecimal string, but that is even longer than the Base64 string.
You may want to take a look at What is the most efficient way to encode an arbitrary GUID into readable ASCII (33-127)? It discusses Base85 encoding, which is also used to compress PDF files.
That said, the difference between Base64 and Base85 in your case is only 8 characters.
You can safely remove the trailing '==' in the Base64 string, because it is only there for alignment and will always be present for 64-byte values (of course, you will have to add these characters back before decoding the string).
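For illustration, a minimal sketch of stripping and restoring the padding (variable names are just examples, not from the original post):
// Sketch only: strip the trailing "==" for display and restore it before decoding.
// For a 64-byte value the Base64 form is always 88 characters ending in "==".
byte[] keyBytes = new byte[64];                           // stand-in for your actual 64-byte key
string full = Convert.ToBase64String(keyBytes);           // 88 chars, ends in "=="
string shortForm = full.TrimEnd('=');                     // 86 chars, shown to the user
// Before decoding, pad back out to a multiple of 4 characters:
string restored = shortForm.PadRight((shortForm.Length + 3) / 4 * 4, '=');
byte[] roundTripped = Convert.FromBase64String(restored);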
Since you mention you want users to be able to type in the string, there will be a trade-off between ease of use for the users and the length of the string. Even typing a Base64 string is prone to a lot of errors; Base32 strings are much easier to type, but the length increases correspondingly.
If the users can copy and paste the key, then the above is moot and there is no strong reason for the string to be as short as possible.
Obviously, you can only fit a certain amount of data into a fixed number of characters. You have pretty much maxed out the limit with Base64 already, which gives you 6 bits per character.
Therefore you need to reduce the amount of data that needs to be stored. Can you reduce the key length? You could use a 96 bit key (by always leaving all other bytes zero). That would require 16 base64 characters which is much better.
It seems you don't need much security against brute forcing. So you can reduce the key size even further.
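As a rough sketch of that idea (assuming a 12-byte, i.e. 96-bit, key is acceptable for your threat model):
using System;
using System.Security.Cryptography;

// Sketch: a 12-byte (96-bit) random key. 12 is a multiple of 3, so the
// Base64 form is exactly 16 characters with no '=' padding.
byte[] key = new byte[12];
using (var rng = RandomNumberGenerator.Create())
{
    rng.GetBytes(key);
}
string userKey = Convert.ToBase64String(key);          // 16 characters
byte[] decoded = Convert.FromBase64String(userKey);    // back to the original 12 bytes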
I am currently attempting this challenge (http://cryptopals.com/sets/1/challenges/1) and I am having some trouble completing the task in C#. I cannot seem to parse the number into a big integer.
My code looks like this:
string output = "";
BigInteger hexValue = BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6");
output = Convert.ToBase64String(hexValue.ToByteArray());
Console.WriteLine(hexValue);
Console.WriteLine(output);
Console.ReadKey();
return "";
At present the problem is that when I run the program it fails with the error
System.FormatException: 'The value could not be parsed.', and I am not entirely sure why.
So, what is the appropriate way to get a large integer from a string into a BigInt?
The initial problem
The BigInteger.Parse method expects the value to be decimal, not hex. You can "fix" that by passing in NumberStyles.HexNumber.
The bigger problem with using BigInteger for this
If you're just trying to convert a string of hex digits into bytes, I would avoid using BigInteger at all. For one thing, you could end up with problems if the original byte array started with zeroes: the zeroes wouldn't be in the resulting byte array. (Sample input: "0001" - you want to get two bytes out, but you'll only get one, even after persuading it to parse hex.)
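A small illustration of that (not from the original answer):
using System;
using System.Globalization;
using System.Numerics;

// "0001" represents two bytes (0x00, 0x01), but BigInteger only keeps the
// numeric value 1, so the leading zero byte is lost.
BigInteger value = BigInteger.Parse("0001", NumberStyles.HexNumber);
byte[] lostZero = value.ToByteArray();
Console.WriteLine(lostZero.Length);   // 1, not 2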
Even if you don't lose any information, the byte[] you receive from BigInteger.ToByteArray() isn't what you were probably expecting. For example, consider this code, which just converts the data to byte[] and back to hex via BitConverter:
BigInteger bigInt = BigInteger.Parse("1234567890ABCDEF", NumberStyles.HexNumber);
byte[] bytes = bigInt.ToByteArray();
Console.WriteLine(BitConverter.ToString(bytes));
The output of that is "EF-CD-AB-90-78-56-34-12" - because BigInteger.ToByteArray returns the data in little-endian order:
The individual bytes in the array returned by this method appear in little-endian order. That is, the lower-order bytes of the value precede the higher-order bytes.
That's not what you want - because it means the last part of the original string is the first part of the byte array, etc.
Avoiding BigInteger altogether
Instead, parse the data directly to a byte array, as in this question, or this one, or various others. I won't reproduce the code here, but it's simple enough, with different options depending on whether you're trying to create simple source code or an efficient program.
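For reference, here is a minimal sketch of one straightforward approach (illustrative only, not the exact code from the linked questions; HexToBytes is just a name chosen for this sketch):
// Assumes an even number of hex digits and no whitespace or "0x" prefix.
static byte[] HexToBytes(string hex)
{
    byte[] result = new byte[hex.Length / 2];
    for (int i = 0; i < result.Length; i++)
    {
        result[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
    }
    return result;
}
// The Base64 step is then a lossless transformation of those bytes:
// string base64 = Convert.ToBase64String(HexToBytes(hexInput));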
General advice on conversions
In general it's a good idea to avoid intermediate representations of data unless you're absolutely convinced that you won't lose information in the process - as you would here. It's fine to convert the hex string to a byte array before converting the result to base64, because that's not a lossy transformation.
So your conversions are:
String (hex) to BigInteger: lossy (in the context of leading 0s being significant, as they are in this situation)
BigInteger to byte[]: not lossy
byte[] to String (base64): not lossy
I'm recommending:
String (hex) to byte[]: not lossy (assuming you have an even number of nybbles to convert, which is generally a reasonable assumption)
byte[] to String (base64): not lossy
Use NumberStyles.HexNumber:
BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6",
NumberStyles.HexNumber,
CultureInfo.InvariantCulture);
If your number is supposed to always be positive, add a leading zero to your string; otherwise a first hex digit of 8 or above is treated as a sign bit and the parsed value comes out negative.
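For example:
// Without a leading zero, a first hex digit of 8-F is treated as a sign bit:
BigInteger negative = BigInteger.Parse("FF", NumberStyles.HexNumber);    // -1
BigInteger positive = BigInteger.Parse("0FF", NumberStyles.HexNumber);   // 255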
The problem is that the input is not decimal but hexadecimal, therefore you need to pass an additional parameter for parsing:
BigInteger number = BigInteger.Parse(
hexString,
NumberStyles.AllowHexSpecifier);
I'm using HMACSHA512 to hash data using a shared key. Since the key is shared I'd like for it to be all printable characters for ease of transport. I'm wondering what the best approach is to generating these keys.
I'm currently using the GetBytes() method of RNGCryptoServiceProvider to generate a key, but the byte array it returns contains non-printable characters. So I'm wondering if it is secure to base64 encode the result or does that erode the randomness too much and make things much less secure? If that isn't a good approach can you suggest one?
I do understand that by limiting the keys to printable characters I am limiting the overall breadth of the key space (ie: lopping off 1 of the 8 bits), but I am OK with that.
If you can handle not auto-generating the key then http://www.grc.com/passwords is a good source of VERY random key material.
Base64 wouldn't reduce the underlying entropy of the byte array. You could generate the key and use it in its raw form, Base64 encode it to transport it to where you need it to be, and then Base64 decode it back to the raw form before you use it in the new location. There is no loss of entropy in this operation: Base64 carries only 6 bits per character instead of 8 per byte, but the encoded result is correspondingly longer, so overall the entropy is the same.
The other way you could do it would be to get 24 random bytes, for 192 bits worth of entropy. Base64 encoding this would give you a 32-character string (256 bits of storage) which still carries the original 192 bits of entropy. You could use this string as your shared key directly.
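A rough sketch of that approach (names are illustrative):
using System;
using System.Security.Cryptography;
using System.Text;

// Sketch: 24 random bytes -> a 32-character Base64 string (no '=' padding)
// that is printable and copy-pasteable, carrying 192 bits of entropy.
byte[] keyBytes = new byte[24];
using (var rng = new RNGCryptoServiceProvider())
{
    rng.GetBytes(keyBytes);
}
string printableKey = Convert.ToBase64String(keyBytes);   // 32 characters

// The receiver can either decode back to the raw 24 bytes, or use the
// 32-character string itself (as bytes) as the HMAC key:
byte[] rawKey = Convert.FromBase64String(printableKey);
using (var hmac = new HMACSHA512(rawKey))
{
    byte[] mac = hmac.ComputeHash(Encoding.UTF8.GetBytes("message"));
}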
BASE64 transforms a byte sequence so it uses only certain printable characters.
This transformation does not change the information in any way, just how it is stored. It is also reversible: you can get the original byte sequence by decoding the BASE64 output.
So using BASE64 does not "erode the randomness" or limit the key space in any way.
Can anyone tell me how many bytes the below string will take up?
string abc = "a";
From my article on strings:
In the current implementation at least, strings take up 20+(n/2)*4 bytes (rounding the value of n/2 down), where n is the number of characters in the string. The string type is unusual in that the size of the object itself varies. The only other classes which do this (as far as I know) are arrays.
Essentially, a string is a character array in memory, plus the length of the array and the length of the string (in characters). The length of the array isn't always the same as the length in characters, as strings can be "over-allocated" within mscorlib.dll, to make building them up easier. (StringBuilder does this, for instance.) While strings are immutable to the outside world, code within mscorlib can change the contents, so StringBuilder creates a string with a larger internal character array than the current contents requires, then appends to that string until the character array is no longer big enough to cope, at which point it creates a new string with a larger array.
The string length member also contains a flag in its top bit to say whether or not the string contains any non-ASCII characters. This allows for extra optimisation in some cases.
I suspect that was written before I had a chance to work with a 64-bit CLR; I suspect in 64-bit land each string takes up either 4 or 8 more bytes.
EDIT: I wrote up a blog post more recently which includes 64-bit information (and contradicts the above slightly for x86...)
Basically, each string object requires a constant 20 bytes for the object data.
The character buffer requires 2 bytes per character.
So the memory usage estimate for a string, in bytes, is 20 + (2 * Length).
Normally, then, the memory used in the CLR for this string is 22 bytes.
However, when you pass or send this string to another system, or in most other usage, you do not need that much memory (you never need the 20 bytes of object data). It then depends on the type of encoding you select when you use it.
With the default encoding, it takes 1 byte per character.
So the answer is 1 byte with the default encoding.
You can check with this code:
Encoding.Default.GetBytes("a"); //It will give you a byte array of size 1.
Encoding.Default.GetBytes("ABC"); //It will give you a byte array of size 3.
If you are asking about the size of the string object itself, it is hard to say exactly without a debugger (and it may not even be possible with one), because string uses pointers internally.
If you are asking about the size of the sequence of chars that it contains, then it is 2 bytes, because strings are stored as UTF-16 and all chars in the Basic Multilingual Plane are encoded with two bytes each.
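You can check the UTF-16 size of the character data directly (illustrative):
using System;
using System.Text;

Console.WriteLine(Encoding.Unicode.GetByteCount("a"));    // 2: one UTF-16 code unit
Console.WriteLine(Encoding.Unicode.GetByteCount("abc"));  // 6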
Is it possible to get strings, ints, etc in binary format? What I mean is that assume I have the string:
"Hello" and I want to store it in binary format, so assume "Hello" is
11110000110011001111111100000000 in binary (I know it isn't, I just typed something quickly).
Can I store the above binary not as a string, but in its actual format, as bits?
In addition to this, is it actually possible to store less than 8 bits? What I am getting at is: if the letter A is the most frequent letter used in a text, can I use 1 bit to store it for compression purposes, instead of building a binary tree?
Is it possible to get strings, ints, etc in binary format?
Yes. There are several different methods for doing so. One common method is to make a MemoryStream out of an array of bytes, and then make a BinaryWriter on top of that memory stream, and then write ints, bools, chars, strings, whatever, to the BinaryWriter. That will fill the array with the bytes that represent the data you wrote. There are other ways to do this too.
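A minimal sketch of that approach (using an expandable MemoryStream for simplicity):
using System;
using System.IO;

// Sketch: write mixed values into an in-memory binary buffer.
byte[] data;
using (var stream = new MemoryStream())
using (var writer = new BinaryWriter(stream))
{
    writer.Write(42);        // int: 4 bytes, little-endian
    writer.Write(true);      // bool: 1 byte
    writer.Write("Hello");   // string: length prefix followed by UTF-8 bytes
    writer.Flush();
    data = stream.ToArray(); // the raw bytes representing everything written
}
Console.WriteLine(data.Length);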
Can I store the above binary not as a string, but in the actual format with the bits.
Sure, you can store an array of bytes.
is it actually possible to store less than 8 bits.
No. The smallest unit of storage in C# is a byte. However, there are classes that will let you treat an array of bytes as an array of bits. You should read about the BitArray class.
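For example, a small illustration of BitArray over a byte array:
using System;
using System.Collections;

byte[] bytes = { 0xA1 };           // 1010 0001 in binary
var bits = new BitArray(bytes);    // exposes each bit individually
for (int i = 0; i < bits.Length; i++)
{
    Console.Write(bits[i] ? '1' : '0');   // printed least-significant bit first
}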
What encoding would you be assuming?
What you are looking for is something like Huffman coding, it's used to represent more common values with a shorter bit pattern.
How you store the bit codes is still limited to whole bytes. There is no data type that uses less than a byte. The way that you store variable width bit values is to pack them end to end in a byte array. That way you have a stream of bit values, but that also means that you can only read the stream from start to end, there is no random access to the values like you have with the byte values in a byte array.
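As a rough illustration of packing variable-width codes end to end (PackBits is a hypothetical helper written for this sketch, not a library method):
using System;
using System.Collections.Generic;

// Sketch: append variable-width bit codes end to end, emitting whole bytes.
static byte[] PackBits(IEnumerable<(uint code, int bitLength)> codes)
{
    var output = new List<byte>();
    byte current = 0;
    int bitsInCurrent = 0;
    foreach (var (code, bitLength) in codes)
    {
        for (int i = bitLength - 1; i >= 0; i--)   // most significant bit of each code first
        {
            int bit = (int)((code >> i) & 1);
            current = (byte)((current << 1) | bit);
            if (++bitsInCurrent == 8)
            {
                output.Add(current);
                current = 0;
                bitsInCurrent = 0;
            }
        }
    }
    if (bitsInCurrent > 0)
        output.Add((byte)(current << (8 - bitsInCurrent)));   // pad the last byte with zeros
    return output.ToArray();
}
// e.g. with 'A' = 1 (1 bit) and 'B' = 01 (2 bits), "AAB" packs into a single byte.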
What I am getting at is if the letter A is the most frequent letter used in a text, can I use 1 bit to store it with regards to compression instead of building a binary tree.
The algorithm you're describing is known as Huffman coding. To relate to your example, if 'A' appears frequently in the data, then the algorithm will represent 'A' as simply 1. If 'B' also appears frequently (but less frequently than A), the algorithm usually would represent 'B' as 01. Then, the rest of the characters would be 00xxxxx... etc.
In essence, the algorithm performs statistical analysis on the data and generates a code that will give you the most compression.
You can use things like:
BitConverter.GetBytes(1);            // an int as 4 bytes
Encoding.ASCII.GetBytes("text");     // a string as ASCII bytes
Encoding.Unicode.GetBytes("text");   // a string as UTF-16 bytes
Once you have the bytes, you can do all the bit twiddling you want. You would need an algorithm of some sort before we can give you much more useful information.
The string is actually stored in binary format, as are all strings.
The difference between a string and another data type is that when your program displays the string, it retrieves the binary and shows the corresponding (ASCII) characters.
If you were to store data in a compressed format, you would need to assign more than 1 bit to most characters. How else would you identify which character is the most frequent?
If 1 represents an 'A', what does 0 mean? All the other characters?
I was working on some encryption/decryption algorithms and I noticed that the encrypted byte[] arrays always had a length of 33, and the char[] arrays always had a length of 44. Does anyone know why this is?
(I'm using Rijndael encryption.)
Padding and text encoding. Most encryption algorithms have a block size, and input needs to be padded up to a multiple of that block size. Also, turning binary data into text usually involves the Base64 algorithm, which expands 3 bytes into 4 characters.
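A rough back-of-the-envelope calculation along those lines (illustrative; assumes PKCS#7-style padding and a 16-byte block):
using System;

int plaintextLength = 20;    // example input size in bytes
int blockSize = 16;          // Rijndael/AES block size in bytes
// Padding rounds the ciphertext up to the next whole block...
int cipherLength = ((plaintextLength / blockSize) + 1) * blockSize;   // 32
// ...and Base64 turns every 3 bytes into 4 characters (rounding up):
int base64Length = 4 * ((cipherLength + 2) / 3);                      // 44
// Note: 33 bytes would also encode to 44 characters, since 4 * ceil(33 / 3) = 44.
Console.WriteLine($"{cipherLength} bytes of ciphertext -> {base64Length} characters");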
That's certainly not true for all encryption algorithms, it must just be a property of the particular one you're using. Without knowing what algorithm it is, I can only guess, but the ratio 33/44 suggests that the algorithm might be compressing each character into 6 bits in the output byte array. That probably means it's making the assumption that no more than 64 distinct characters are used, which is a good assumption for plain text (in fact, that's how base64 decoding works).
But again, without knowing what algorithm you're using, this is all guesswork.
Without knowing the encryption you're using, its a little tough to determine the exact cause. To start, here's an article on How to Calculate the Size of Encrypted Data. It sounds like you might be using a hash of your plaintext, which is why the result is shorter.
Edit: Here's the source for a Rijndael implementation. It looks like the ciphertext output is initially the same length as the (padded) plaintext input, and then they do a Base64 on it, which, as the previous poster mentioned, expands the final output to 4/3 of the ciphertext length.
No, no idea at all, but my first thought would be that your encryption algorithm is built such that it removes 1 bit per 10 from the output data.
Only you can know for sure since we cannot see you code from out here :-)
It would be a pretty lousy encryption algorithm if it was just replacing bytes one-for-one. That was state of the art 50 years ago, and it didn't work very well even then. :)