Generating a random string valid for UTF-8 encode and decode - C#

For testing purposes, I need to generate a random string, which is then encoded into a byte array for transfer over the Web and decoded back into a result string. The test uses the NUnit framework to compare the original string with the result string. Since the encoded byte array has to be Web-friendly, it is encoded with UTF-8.
The string is encoded into a byte array by Encoder.GetBytes from UTF8Encoding. The byte array is decoded back to a string by Decoder.GetChars from UTF8Encoding.
The original string needs to be generated randomly and may contain any sequence of characters that can be encoded and decoded using UTF-8.
My first attempt to generate the string was:
public static String RandomString(Random rnd, Int32 length) {
    StringBuilder str = new StringBuilder(length);
    for (int i = 0; i < length; i++)
        str.Append((char)rnd.Next(char.MinValue, char.MaxValue));
    return str.ToString();
}
The above code produces strings containing sequences that cannot be encoded.
I found some suggestions on the web and improved the code:
public static String RandomString(Random rnd, Int32 length) {
    StringBuilder str = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
        char c = (char)rnd.Next(char.MinValue, char.MaxValue);
        // re-roll while the character falls into the UTF-16 surrogate range
        while (c >= 0xD800 && c <= 0xDFFF)
            c = (char)rnd.Next(char.MinValue, char.MaxValue);
        str.Append(c);
    }
    return str.ToString();
}
The above code has no problem with encoding, but decoding the byte array fails. Furthermore, I am not sure the code covers all possible cases.
Any suggestions on how to generate a random string with the given requirements in C#?
UPD: here is how the random string is used in encoding/decoding:
public static Encoder Utf8Encode = new UTF8Encoding(false, true).GetEncoder();
public static Decoder Utf8Decode = new UTF8Encoding(false, true).GetDecoder();

public unsafe void TestString(Random rnd, int length, byte* byteArray,
                              int arrayLength) {
    int encodedLen;
    String str = RandomString(rnd, length);
    fixed (char* pStr = str) {
        encodedLen = Utf8Encode.GetBytes(pStr, str.Length, byteArray,
                                         arrayLength, true);
    }
    char* buffer = stackalloc char[8192];
    int decodedLen = Utf8Decode.GetChars(byteArray, encodedLen, buffer,
                                         8192, true);
    String res = new String(buffer, 0, decodedLen);
    Assert.AreEqual(str, res);
}
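As an aside, one way to sidestep the surrogate problem entirely is to generate random Unicode code points instead of random UTF-16 code units, skipping the surrogate range, and let char.ConvertFromUtf32 emit the correct UTF-16 sequence (including valid surrogate pairs). This is a minimal sketch of that idea, my own and not from the question (needs System and System.Text):
public static string RandomValidString(Random rnd, int length)
{
    // Random code points outside the surrogate range always survive a strict
    // UTF-8 round trip. Note: 'length' counts code points, so the resulting
    // string may contain up to 2 * length UTF-16 chars.
    var str = new StringBuilder(length);
    for (int i = 0; i < length; i++)
    {
        int cp;
        do
        {
            cp = rnd.Next(0, 0x110000);         // any Unicode code point
        } while (cp >= 0xD800 && cp <= 0xDFFF); // skip surrogate code points
        str.Append(char.ConvertFromUtf32(cp));  // emits surrogate pairs as needed
    }
    return str.ToString();
}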

I have used the code below for generating random UTF-8 character byte sequences. I can't guarantee it captures every aspect of the UTF-8 spec, but it was valuable for my testing purposes, so I'm posting it here.
private static readonly (int, int)[] HeadByteDefinitions =
{
    (1 << 7, 0b0000_0000),  // 1-byte sequence: 0xxxxxxx
    (1 << 5, 0b1100_0000),  // 2-byte sequence: 110xxxxx
    (1 << 4, 0b1110_0000),  // 3-byte sequence: 1110xxxx
    (1 << 3, 0b1111_0000)   // 4-byte sequence: 11110xxx
};

static byte[] RandomUtf8Char(Random gen)
{
    // Weight the sequence length by the number of values each length can express.
    const int totalNumberOfUtf8Chars = (1 << 7) + (1 << 11) + (1 << 16) + (1 << 21);
    int tailByteCnt;
    var rnd = gen.Next(totalNumberOfUtf8Chars);
    if (rnd < (1 << 7))
        tailByteCnt = 0;
    else if (rnd < (1 << 7) + (1 << 11))
        tailByteCnt = 1;
    else if (rnd < (1 << 7) + (1 << 11) + (1 << 16))
        tailByteCnt = 2;
    else
        tailByteCnt = 3;
    var (range, offset) = HeadByteDefinitions[tailByteCnt];
    var headByte = Convert.ToByte(gen.Next(range) + offset);
    // Tail bytes all have the form 10xxxxxx.
    var tailBytes = Enumerable.Range(0, tailByteCnt)
        .Select(_ => Convert.ToByte(gen.Next(1 << 6) + 0b1000_0000));
    return new[] { headByte }.Concat(tailBytes).ToArray();
}
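For what it's worth, here is how the generator above might be exercised (my own usage sketch, not part of the original answer). Since the routine can emit sequences a strict decoder rejects (overlong forms, surrogate code points, values above U+10FFFF), the sketch decodes leniently:
var gen = new Random();
// Concatenate 100 random "characters" into one byte stream (needs System.Linq).
byte[] bytes = Enumerable.Range(0, 100)
    .SelectMany(_ => RandomUtf8Char(gen))
    .ToArray();
// Non-throwing decode: invalid sequences become U+FFFD replacement characters.
string decoded = new UTF8Encoding(false, false).GetString(bytes);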

Related

Convert HEX to UTF-16 (GUID partition name) - WPF App (.NET Framework)

I have a string of hex that I want to convert to UTF-16LE, as specified at https://en.wikipedia.org/wiki/GUID_Partition_Table under "Partition entries (LBA 2–33)". The hex string has a fixed length of 72 bytes. I'm not sure how to convert it; I was thinking of converting it to bytes first and then using the Encoding.BigEndianUnicode property.
Also, when I tried to use Encoding.UTF8.GetChars, I got a lot of spaces in my result.
static void Main(string[] args)
{
    string hexString = "4200610073006900630020006400610074006100200070006100720074006900740069006F006E000000000000000000000000000000000000000000000000000000000000000000";
    int length = hexString.Length;
    byte[] bytes = new byte[length / 2];
    for (int i = 0; i < length; i += 2) {
        bytes[i / 2] = Convert.ToByte(hexString.Substring(i, 2), 16);
    }
    char[] chars = Encoding.UTF8.GetChars(bytes);
    string s = new string(chars);
    Console.WriteLine(s);
}
Prints this:
B a s i c d a t a p a r t i t i o n
(B\0a\0s\0i\0c\0 \0d\0a\0t\0a\0 \0p\0a\0r\0t\0i\0t\0i\0o\0n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0)
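No answer to this question is included here, but since the partition name field is UTF-16LE, the usual fix would be to decode with Encoding.Unicode (.NET's UTF-16LE encoding) and strip the NUL padding. A sketch under that assumption:
char[] chars = Encoding.Unicode.GetChars(bytes); // UTF-16LE decode
string s = new string(chars).TrimEnd('\0');      // drop the zero padding
Console.WriteLine(s);                            // "Basic data partition"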

String of bits to Unicode

I have a string of bits, like this: string str = "0111001101101000". It represents the letters "sh".
I need to turn it into Unicode letters. I'm doing the following:
BitArray bn = new BitArray(str.Length); // creating new bit array
for (int kat = 0; kat < str.Length; kat++)
{
    if (str[kat].ToString() == "0") // adding boolean values into array
    {
        bn[kat] = false;
    }
    else
        bn[kat] = true;
}
byte[] bytes = new byte[bn.Length]; // converting to bytes
bn.CopyTo(bytes, 0);
string output = Encoding.Unicode.GetString(bytes); // encoding
textBox2.Text = output; // result in textbox
But the output text is just a complete mess. How do I do it right?
There are a couple of problems with your code.
First, BitArray reverses the bit order - it's easier to use Convert.ToByte.
Second, your input string contains two bytes (one per character), but you're using Encoding.Unicode to decode it, which is UTF-16 (two bytes per character); you need to use Encoding.UTF8.
Working Code
string str = "0111001101101000";
int numOfBytes = str.Length / 8;
byte[] bytes = new byte[numOfBytes];
for (int i = 0; i < numOfBytes; ++i)
{
    bytes[i] = Convert.ToByte(str.Substring(8 * i, 8), 2);
}
string output = Encoding.UTF8.GetString(bytes);
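For the sample input above, a quick check (my addition, not part of the original answer) confirms the expected result:
Console.WriteLine(output); // prints "sh"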
A) Your string is ASCII, not Unicode: 8 bits per character.
B) The most significant bit of every byte is on the left, hence the strange math used for the bn[...] index.
C) The commented-out part is useless because false is the default state of a BitArray.
D) The length of the byte array was wrong: 8 bits == 1 byte! :-)
string str = "0111001101101000";
BitArray bn = new BitArray(str.Length); // creating new bit array
for (int kat = 0; kat < str.Length; kat++) {
    if (str[kat] == '0') // adding boolean values into array
    {
        //bn[(kat / 8 * 8) + 7 - (kat % 8)] = false;
    } else {
        bn[(kat / 8 * 8) + 7 - (kat % 8)] = true;
    }
}
// 8 bits in a byte
byte[] bytes = new byte[bn.Length / 8]; // converting to bytes
bn.CopyTo(bytes, 0);
string output = Encoding.ASCII.GetString(bytes); // encoding
Probably better:
string str = "0111001101101000";
byte[] bytes = new byte[str.Length / 8];
for (int ix = 0, weight = 128, ix2 = 0; ix < str.Length; ix++) {
    if (str[ix] == '1') {
        bytes[ix2] += (byte)weight;
    }
    weight /= 2;
    // Every 8 bits we "reset" the weight
    // and increment ix2
    if (weight == 0) {
        ix2++;
        weight = 128;
    }
}
string output = Encoding.ASCII.GetString(bytes); // encoding

Encode decoded string with Base64String

I am learning how to encode and decode strings. This is a method I found on the web to decode cipher text into plain text.
public static string Decode(string cipherText)
{
    byte[] numArray = Convert.FromBase64String(cipherText);
    byte[] numArray1 = new byte[(int)numArray.Length - 1];
    byte num = (byte)(numArray[0] ^ 188);
    for (int i = 1; i < (int)numArray.Length; i++)
    {
        numArray1[i - 1] = (byte)(numArray[i] ^ 188 ^ num);
    }
    return Encoding.ASCII.GetString(numArray1);
}
My problem is that I have no idea how to encode back to the original state. I tried this method and it doesn't work.
public static string Encode(string plainText)
{
    byte[] bytes = Encoding.ASCII.GetBytes(plainText);
    byte[] results = new byte[(int)bytes.Length - 1];
    byte num = (byte)(bytes[0] ^ 188);
    for (int i = 1; i < bytes.Length; i++)
    {
        results[i - 1] = (byte)(bytes[i] ^ 188 ^ num);
    }
    return Convert.ToBase64String(results);
}
Although I agree entirely with SLaks' comment that the above does not constitute any kind of crypto you should use, the following procedure will produce the "encrypted" data that you are looking to decrypt:
public static string Encode(string plainText)
{
    byte[] numArray = System.Text.Encoding.Default.GetBytes(plainText);
    byte[] numArray1 = new byte[(int)numArray.Length + 1];
    // Fill the output with random bytes; only the first one matters - it is
    // the random seed byte that Decode reads back.
    (new Random()).NextBytes(numArray1);
    byte num = (byte)(numArray1[0] ^ 188);
    for (int i = 0; i < (int)numArray.Length; i++)
    {
        numArray1[i + 1] = (byte)(num ^ 188 ^ numArray[i]);
    }
    return Convert.ToBase64String(numArray1);
}
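A quick round-trip check of the two methods above (my own usage sketch, not from the original answer):
string secret = "hello world";
string encoded = Encode(secret);
Console.WriteLine(Decode(encoded)); // prints "hello world"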
Please do not, for a single second, consider using this as a method for 'encrypting' sensitive data.

An efficient way to Base64 encode a byte array?

I have a byte[] and I'm looking for the most efficient way to base64 encode it.
The problem is that the built-in .NET method Convert.FromBase64CharArray requires a char[] as input, and converting my byte[] to a char[] just to convert it again to a base64-encoded array seems pretty stupid.
Is there any more direct way to do it?
EDIT: I'll explain what I want to achieve better - I have a byte[] and I need to return a new base64-encoded byte[].
Byte[] -> String: use System.Convert.ToBase64String
Convert.ToBase64String(byte[] data)
String -> Byte[]: use System.Convert.FromBase64String
Convert.FromBase64String(string data)
Base64 is a way to represent bytes in a textual form (as a string). So there is no such thing as a Base64 encoded byte[]. You'd have a base64 encoded string, which you could decode back to a byte[].
However, if you want to end up with a byte array, you could take the base64 encoded string and convert it to a byte array, like:
string base64String = Convert.ToBase64String(bytes);
byte[] stringBytes = Encoding.ASCII.GetBytes(base64String);
This, however, makes no sense because the best way to represent a byte[] as a byte[] is the byte[] itself :)
Here is code to base64 encode directly into a byte array (tested to perform within ±10% of the .NET implementation while allocating half the memory):
static public void testBase64EncodeToBuffer()
{
    for (int i = 1; i < 200; ++i)
    {
        // prep test data
        byte[] testData = new byte[i];
        for (int j = 0; j < i; ++j)
            testData[j] = (byte)(j ^ i);
        // test
        testBase64(testData);
    }
}

static void testBase64(byte[] data)
{
    if (!appendBase64(data, 0, data.Length, false)
        .SequenceEqual(System.Text.Encoding.ASCII.GetBytes(Convert.ToBase64String(data))))
        throw new Exception("Base 64 encoding failed");
}
// The original post omitted the lookup table; it is the standard base64 alphabet.
static readonly byte[] base64EncodingTable = System.Text.Encoding.ASCII.GetBytes(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/");

static public byte[] appendBase64(byte[] data
                                  , int offset
                                  , int size
                                  , bool addLineBreaks = false)
{
    byte[] buffer;
    int bufferPos = 0;
    int requiredSize = (4 * ((size + 2) / 3));
    // 2 extra line-break characters per 76 output characters
    if (addLineBreaks) requiredSize += (requiredSize / 76) * 2;
    buffer = new byte[requiredSize];
    UInt32 octet_a;
    UInt32 octet_b;
    UInt32 octet_c;
    UInt32 triple;
    int lineCount = 0;
    int sizeMod = size - (size % 3);
    // adding all complete data triplets
    for (; offset < sizeMod;)
    {
        octet_a = data[offset++];
        octet_b = data[offset++];
        octet_c = data[offset++];
        triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;
        buffer[bufferPos++] = base64EncodingTable[(triple >> 3 * 6) & 0x3F];
        buffer[bufferPos++] = base64EncodingTable[(triple >> 2 * 6) & 0x3F];
        buffer[bufferPos++] = base64EncodingTable[(triple >> 1 * 6) & 0x3F];
        buffer[bufferPos++] = base64EncodingTable[(triple >> 0 * 6) & 0x3F];
        if (addLineBreaks)
        {
            if (++lineCount == 19) // 19 triples = 76 output characters per line
            {
                buffer[bufferPos++] = 13; // CR
                buffer[bufferPos++] = 10; // LF
                lineCount = 0;
            }
        }
    }
    // last 1 or 2 bytes
    if (sizeMod < size)
    {
        octet_a = offset < size ? data[offset++] : (UInt32)0;
        octet_b = offset < size ? data[offset++] : (UInt32)0;
        octet_c = (UInt32)0; // last character is definitely padded
        triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;
        buffer[bufferPos++] = base64EncodingTable[(triple >> 3 * 6) & 0x3F];
        buffer[bufferPos++] = base64EncodingTable[(triple >> 2 * 6) & 0x3F];
        buffer[bufferPos++] = base64EncodingTable[(triple >> 1 * 6) & 0x3F];
        buffer[bufferPos++] = base64EncodingTable[(triple >> 0 * 6) & 0x3F];
        // add padding '='
        sizeMod = size % 3;
        // last character is definitely padded
        buffer[bufferPos - 1] = (byte)'=';
        if (sizeMod == 1) buffer[bufferPos - 2] = (byte)'=';
    }
    return buffer;
}
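A quick sanity check of the routine above (my own usage; the expected output is easy to verify by hand):
byte[] data = { 1, 2, 3, 4, 5 };
byte[] encoded = appendBase64(data, 0, data.Length);
Console.WriteLine(Encoding.ASCII.GetString(encoded)); // prints "AQIDBAU="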
byte[] base64EncodedStringBytes = Encoding.ASCII.GetBytes(Convert.ToBase64String(binaryData));
Based on your edit and comments... would this be what you're after?
byte[] newByteArray = UTF8Encoding.UTF8.GetBytes(Convert.ToBase64String(currentByteArray));
You could use String Convert.ToBase64String(byte[]) to encode the byte array into a base64 string, then Byte[] Convert.FromBase64String(string) to convert the resulting string back into a byte array.
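For instance, a trivial round trip (my own example, not from the original answer):
byte[] original = { 1, 2, 3 };
string encoded = Convert.ToBase64String(original);  // "AQID"
byte[] decoded = Convert.FromBase64String(encoded); // { 1, 2, 3 }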
To display an image stored as a byte[] by converting it to a base64 data URI:
Model property:
public byte[] NomineePhoto { get; set; }

public string NomineePhoneInBase64Str
{
    get {
        if (NomineePhoto == null)
            return "";
        return $"data:image/png;base64,{Convert.ToBase64String(NomineePhoto)}";
    }
}
In view:
<img style="height:50px;width:50px" src="@item.NomineePhoneInBase64Str" />
import base64
encoded = base64.b64encode(b'[3.\x01#\xbcr\xa9/$\xc3\xe1 "')
print(encoded)
data = base64.b64decode(encoded)
print(data)
This approach works regardless of the characters involved; for reference, the sample input includes a space and a double quote.
// Requires System.Configuration, System.Data.SqlClient, and System.Drawing.
public void ProcessRequest(HttpContext context)
{
    string constring = ConfigurationManager.ConnectionStrings["SQL_Connection_String"].ConnectionString;
    SqlConnection conn = new SqlConnection(constring);
    conn.Open();
    SqlCommand cmd = new SqlCommand("select image1 from TestGo where TestId=1", conn);
    SqlDataReader dr = cmd.ExecuteReader();
    dr.Read();
    context.Response.Clear();
    Byte[] bytes = (Byte[])dr[0];
    // The column holds a base64 string stored as raw bytes; decode it back to image bytes.
    string d = System.Text.Encoding.Default.GetString(bytes);
    byte[] bytes2 = Convert.FromBase64String(d);
    Image img = Image.FromStream(new MemoryStream(bytes2));
    img.Save(context.Response.OutputStream, ImageFormat.Png);
    context.Response.Flush();
    conn.Close();
    context.Response.End();
}

String to byte array

I have to convert a string to bytes (16 bit) in JavaScript. I can do this in .NET with the following code, but I have to port it to an old classic ASP app that uses JavaScript.
string strShared_Key = "6fc2e550abc4ea333395346123456789";
int nLength = strShared_Key.Length;
byte[] keyMAC = new byte[nLength / 2];
for (int i = 0; i < nLength; i += 2)
    keyMAC[i / 2] = Convert.ToByte(strShared_Key.Substring(i, 2), 16);
This is the JavaScript function, but it doesn't return the same output as the .NET code above.
function String2Bin16bit(inputString) {
    var arr = []; // byte array
    for (var i = 0; i < inputString.length; i += 2) {
        // get chunk of two characters and parse to number
        arr.push(parseInt(inputString.substr(i, 2), 16));
    }
    return arr;
}
You want parseInt(x, 16), which parses x as a number in base 16.
var str = "aabbcc"; // string
var arr = []; // byte array
for (var i = 0; i < str.length; i += 2) {
    // get chunk of two characters and parse to number
    arr.push(parseInt(str.substr(i, 2), 16));
}
