HEX String to Chinese String - c#

I have the following code to convert from HEX to ASCII.
//Hexadecimal to ASCII Convertion
private static string hex2ascii(string hexString)
{
MessageBox.Show(hexString);
StringBuilder sb = new StringBuilder();
for (int i = 0; i <= hexString.Length - 2; i += 2)
{
sb.Append(Convert.ToString(Convert.ToChar(Int32.Parse(hexString.Substring(i, 2), System.Globalization.NumberStyles.HexNumber))));
}
return sb.ToString();
}
input hexString = D3FCC4A7B6FABBB7
output return = Óüħ¶ú»·
The output that I need is 狱魔耳环, but I am getting Óüħ¶ú»· instead.
How would I make it display the correct string?

First, convert the hex string to a byte[], e.g. using code at How do you convert Byte Array to Hexadecimal String, and vice versa?. Then use System.Text.Encoding.Unicode.GetString(myArray) (use proper encoding, might not be Unicode, but judging from your example it is a 16-bit encoding, which, incidentally, is not "ASCII", which is 7-bit) to convert it to a string.

Related

Convert Unicode string into proper string

I have a string which contains unicode data.
I want to write it in a file . When the data is written in file it gives me simple unicode value instead of languages other than english.
string originalString = ((char)(buffer[index])).ToString();
//sb.Append(DecodeEncodedNonAsciiCharacters(originalString.ToString()));
foreach (char c1 in originalString)
{
// test if char is ascii, otherwise convert to Unicode Code Point
int cint = Convert.ToInt32(c1);
if (cint <= 127 && cint >= 0)
asAscii.Append(c1.ToString());
else
{
//String s = Char.ConvertFromUtf32(cint);
asAscii.Append(String.Format("\\u{0:x4} ", cint).Trim());
// asAscii.Append(s);
}
}
sb.Append((asAscii));
Console.WriteLine();
when i see the output file the data shows like this
1 00:00:27,709-->00:00:32,959
1.2 \u00e0\u00a4\u0085\u00e0\u00a4\u00b0\u00e0\u00a4\u00ac \u00e0\u00a4\u00b2\u00e0\u00a5\u008b\u00e0\u00a4\u0097 28
\u00e0\u00a4\u00b0\u00e0\u00a4\u00be\u00e0\u00a4\u009c\u00e0\u00a5\u008d\u00e0\u00a4\u00af
\u00e0\u00a4\u0094\u00e0\u00a4\u00b0
\u00e0\u00a4\u00b8\u00e0\u00a4\u00be\u00e0\u00a4\u00a4
\u00e0\u00a4\u0095\u00e0\u00a5\u0087\u00e0\u00a4\u0082\u00e0\u00a4\u00a6\u00e0\u00a5\u008d\u00e0\u00a4\u00b0
\u00e0\u00a4\u00b6\u00e0\u00a4\u00be\u00e0\u00a4\u00b8\u00e0\u00a4\u00bf\u00e0\u00a4\u00a4
\u00e0\u00a4\u00aa\u00e0\u00a5\u008d\u00e0\u00a4\u00b0\u00e0\u00a4\u00a6\u00e0\u00a5\u0087\u00e0\u00a4\u00b6
but it should look like this
1 00:00:27,400 --> 00:00:32,760
1.2 अरब लोग 28 राज्य और सात केंद्र शासित प्रदेश
I have tried many things but none has done my job.
string unicodeString = "This string contains the unicode character Pi(\u03a0)";
// Create two different encodings.
Encoding ascii = Encoding.ASCII;
Encoding unicode = Encoding.Unicode;
// Convert the string into a byte[].
byte[] unicodeBytes = unicode.GetBytes(unicodeString);
// Perform the conversion from one encoding to the other.
byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);
// Convert the new byte[] into a char[] and then into a string.
// This is a slightly different approach to converting to illustrate
// the use of GetCharCount/GetChars.
char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
string asciiString = new string(asciiChars);
// Display the strings created before and after the conversion.
Console.WriteLine("Original string: {0}", unicodeString);
Console.WriteLine("Ascii converted string: {0}", asciiString);

C# Character from Unicode

I have a unicode character from the FontAwesome cheat sheet:
#xf042;
How do I put that character into c# ?
string s = "????";
I have tried entering it is as and using a .
If you just want a lighter version of what Darin posted to convert the hex value to a string containing the unicode position from the private area of the FontAwesome font, you can use this >>
private static string UnicodeToChar( string hex ) {
int code=int.Parse( hex, System.Globalization.NumberStyles.HexNumber );
string unicodeString=char.ConvertFromUtf32( code );
return unicodeString;
}
Just call it as follows >>
string s = UnicodeToChar( "f042" );
Alternatively, you can simply use the C# class with all the icons and loader pre-written here >> FontAwesome For WinForms CSharp
Assuming the hex input represents UTF8 encoded string you could have a function that will convert a HEX string:
public static string ConvertHexToString(string hex)
{
int numberChars = hex.Length;
byte[] bytes = new byte[numberChars / 2];
for (int i = 0; i < numberChars; i += 2)
{
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
}
return Encoding.UTF8.GetString(bytes);
}
and then filter out the unnecessary characters from your input before feeding it to this function:
string input = "#xf042;";
string s = input.Replace("#x", string.Empty).Replace(";", string.Empty);
string result = ConvertHexToString(s);
Obviously you will need to adjust the correct encoding based on the input, because the hex simply represents a byte array and in order to decode this byte array back to a string you're gonna need to know the encoding.

Are there any rules for the XOR cipher?

I have the following method which takes the plain text and the key text. It is supposed to return a string encrypted with the XOR method as ascii.
public static string encryptXOREng(string plainText,string keyText)
{
StringBuilder chiffreText = new StringBuilder();
byte[] binaryPlainText = System.Text.Encoding.ASCII.GetBytes(plainText);
byte[] binaryKeyText = System.Text.Encoding.ASCII.GetBytes(keyText);
for(int i = 0;i<plainText.Length;i++)
{
int result = binaryPlainText[i] ^ binaryKeyText[i];
chiffreText.Append(Convert.ToChar(result));
}
return chiffreText.ToString();
}
For some characters it runs just fine. But for example if it performs XOR on 'G' & 'M', which is 71 XOR 77 it returns 10. And 10 stands for Line feed. This is then actually not represented by a character in my output. This leads to a plain text of a length being encrypted to a cipher string which is only 2 characters long, in some cases. I suppose this would make a decryption impossible, even with a key? Or are the ascii characters 0 - 31 there but simply not visible?
To avoid non printable chars use Convert.ToBase64String
public static string encryptXOREng(string plainText, string keyText)
{
List<byte> chiffreText = new List<byte>();
byte[] binaryPlainText = System.Text.Encoding.ASCII.GetBytes(plainText);
byte[] binaryKeyText = System.Text.Encoding.ASCII.GetBytes(keyText);
for (int i = 0; i < plainText.Length; i++)
{
int result = binaryPlainText[i] ^ binaryKeyText[i % binaryKeyText.Length];
chiffreText.Add((byte)result);
}
return Convert.ToBase64String(chiffreText.ToArray());
}
PS: In your code you assume keyText is not shorter than plainText, I fixed it also.
As far as i know there are no rules specific to xor-ciphers. Cryptographic functions often output values that are not printable, which makes sense - the result is not supposed to be readable. In stead you may want to use the output bytes directly or a base64 encoded result.
I would do something like:
public static byte[] XORCipher(string plainText, string keyText)
{
byte[] binaryPlainText = System.Text.Encoding.ASCII.GetBytes(plainText);
byte[] binaryKeyText = System.Text.Encoding.ASCII.GetBytes(keyText);
for(int i = 0;i<plainText.Length;i++)
{
binaryPlainText[i] ^= binaryKeyText[i];
}
return binaryPlainText;
}

How to convert a string to byte array

I have a string and want to convert it to a byte array of hex value using C#.
for eg, "Hello World!" to byte[] val=new byte[] {0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x57, 0x6F, 0x72, 0x6C, 0x64, 0x21};,
I see the following code in Converting string value to hex decimal
string input = "Hello World!";
char[] values = input.ToCharArray();
foreach (char letter in values)
{
// Get the integral value of the character.
int value = Convert.ToInt32(letter);
// Convert the decimal value to a hexadecimal value in string form.
string hexOutput = String.Format("0x{0:X}", value);
Console.WriteLine("Hexadecimal value of {0} is {1}", letter, hexOutput);
}
I want this value into byte array but can't write like this
byte[] yy = new byte[values.Length];
yy[i] = Convert.ToByte(Convert.ToInt32(hexOutput));
I try this code referenced from How to convert a String to a Hex Byte Array? where I passed the hex value 48656C6C6F20576F726C6421 but I got the decimal value not hex.
public byte[] ToByteArray(String HexString)
{
int NumberChars = HexString.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
{
bytes[i / 2] = Convert.ToByte(HexString.Substring(i, 2), 16);
}
return bytes;
}
and I also try code from How can I convert a hex string to a byte array?
But once I used Convert.ToByte or byte.Parse , the value change to decimal value.
How should I do?
Thanks in advance
I want to send 0x80 (i.e, 128) to serial port but when I copy and paste the character equivalent to 128 to the variable 'input' and convert to byte, I got 63 (0x3F). So I think I need to send hex array. I think I got the wrong idea. Pls see screen shot.
For now, I solve this to combine byte arrays.
string input = "Hello World!";
byte[] header = new byte[] { 2, 48, 128 };
byte[] body = Encoding.ASCII.GetBytes(input);
Hexadecimal has nothing to do with this, your desired result is nothing more nor less than an array of bytes containing the ASCII codes.
Try Encoding.ASCII.GetBytes(s)
There's something strange with your requirement:
I have a string and want to convert it to a byte array of hex value
using C#.
An byte is just an 8-bit value. You can present it as decimal (e.g. 16) or hexidecimal (e.g. 0x10).
So, what do you realy want?
In case you are really wanting to get a string which contains the hex representation of an array of bytes, here's how you can do that:
public static string BytesAsString(byte[] bytes)
{
string hex = BitConverter.ToString(bytes); // This puts "-" between each value.
return hex.Replace("-",""); // So we remove "-" here.
}
It seems like you’re mixing converting to array and displaying array data.
When you have array of bytes it’s just array of bytes and you can represent it in any possible way binary, decimal, hexadecimal, octal, whatever… but that is only valid if you want to visually represent these.
Here is a code that manually converts string to byte array and then to array of strings in hex format.
string s1 = "Stack Overflow :)";
byte[] bytes = new byte[s1.Length];
for (int i = 0; i < s1.Length; i++)
{
bytes[i] = Convert.ToByte(s1[i]);
}
List<string> hexStrings = new List<string>();
foreach (byte b in bytes)
{
hexStrings.Add(Convert.ToInt32(b).ToString("X"));
}

How to convert UTF-8 byte[] to string

I have a byte[] array that is loaded from a file that I happen to known contains UTF-8.
In some debugging code, I need to convert it to a string. Is there a one-liner that will do this?
Under the covers it should be just an allocation and a memcopy, so even if it is not implemented, it should be possible.
string result = System.Text.Encoding.UTF8.GetString(byteArray);
There're at least four different ways doing this conversion.
Encoding's GetString, but you won't be able to get the original bytes back if those bytes have non-ASCII characters.
BitConverter.ToString The output is a "-" delimited string, but there's no .NET built-in method to convert the string back to byte array.
Convert.ToBase64String You can easily convert the output string back to byte array by using Convert.FromBase64String. Note: The output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.
HttpServerUtility.UrlTokenEncodeYou can easily convert the output string back to byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is it needs System.Web assembly if your project is not a web project.
A full example:
byte[] bytes = { 130, 200, 234, 23 }; // A byte array contains non-ASCII (or non-readable) characters
string s1 = Encoding.UTF8.GetString(bytes); // ���
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1); // decBytes1.Length == 10 !!
// decBytes1 not same as bytes
// Using UTF-8 or other Encoding object will get similar results
string s2 = BitConverter.ToString(bytes); // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes
string s3 = Convert.ToBase64String(bytes); // gsjqFw==
byte[] decByte3 = Convert.FromBase64String(s3);
// decByte3 same as bytes
string s4 = HttpServerUtility.UrlTokenEncode(bytes); // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
A general solution to convert from byte array to string when you don't know the encoding:
static string BytesToStringConverted(byte[] bytes)
{
using (var stream = new MemoryStream(bytes))
{
using (var streamReader = new StreamReader(stream))
{
return streamReader.ReadToEnd();
}
}
}
Definition:
public static string ConvertByteToString(this byte[] source)
{
return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
}
Using:
string result = input.ConvertByteToString();
Converting a byte[] to a string seems simple, but any kind of encoding is likely to mess up the output string. This little function just works without any unexpected results:
private string ToString(byte[] bytes)
{
string response = string.Empty;
foreach (byte b in bytes)
response += (Char)b;
return response;
}
I saw some answers at this post and it's possible to be considered completed base knowledge, because I have a several approaches in C# Programming to resolve the same problem. The only thing that is necessary to be considered is about a difference between pure UTF-8 and UTF-8 with a BOM.
Last week, at my job, I needed to develop one functionality that outputs CSV files with a BOM and other CSV files with pure UTF-8 (without a BOM). Each CSV file encoding type will be consumed by different non-standardized APIs. One API reads UTF-8 with a BOM and the other API reads without a BOM. I needed to research the references about this concept, reading the "What's the difference between UTF-8 and UTF-8 without BOM?" Stack Overflow question, and the Wikipedia article "Byte order mark" to build my approach.
Finally, my C# Programming for both UTF-8 encoding types (with BOM and pure) needed to be similar to this example below:
// For UTF-8 with BOM, equals shared by Zanoni (at top)
string result = System.Text.Encoding.UTF8.GetString(byteArray);
//for Pure UTF-8 (without B.O.M.)
string result = (new UTF8Encoding(false)).GetString(byteArray);
Using (byte)b.ToString("x2"), Outputs b4b5dfe475e58b67
public static class Ext {
public static string ToHexString(this byte[] hex)
{
if (hex == null) return null;
if (hex.Length == 0) return string.Empty;
var s = new StringBuilder();
foreach (byte b in hex) {
s.Append(b.ToString("x2"));
}
return s.ToString();
}
public static byte[] ToHexBytes(this string hex)
{
if (hex == null) return null;
if (hex.Length == 0) return new byte[0];
int l = hex.Length / 2;
var b = new byte[l];
for (int i = 0; i < l; ++i) {
b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
}
return b;
}
public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
{
if (bytes == null && bytesToCompare == null) return true; // ?
if (bytes == null || bytesToCompare == null) return false;
if (object.ReferenceEquals(bytes, bytesToCompare)) return true;
if (bytes.Length != bytesToCompare.Length) return false;
for (int i = 0; i < bytes.Length; ++i) {
if (bytes[i] != bytesToCompare[i]) return false;
}
return true;
}
}
There is also class UnicodeEncoding, quite simple in usage:
ByteConverter = new UnicodeEncoding();
string stringDataForEncoding = "My Secret Data!";
byte[] dataEncoded = ByteConverter.GetBytes(stringDataForEncoding);
Console.WriteLine("Data after decoding: {0}", ByteConverter.GetString(dataEncoded));
In addition to the selected answer, if you're using .NET 3.5 or .NET 3.5 CE, you have to specify the index of the first byte to decode, and the number of bytes to decode:
string result = System.Text.Encoding.UTF8.GetString(byteArray, 0, byteArray.Length);
Alternatively:
var byteStr = Convert.ToBase64String(bytes);
The BitConverter class can be used to convert a byte[] to string.
var convertedString = BitConverter.ToString(byteAttay);
Documentation of BitConverter class can be fount on MSDN.
To my knowledge none of the given answers guarantee correct behavior with null termination. Until someone shows me differently I wrote my own static class for handling this with the following methods:
// Mimics the functionality of strlen() in c/c++
// Needed because niether StringBuilder or Encoding.*.GetString() handle \0 well
static int StringLength(byte[] buffer, int startIndex = 0)
{
int strlen = 0;
while
(
(startIndex + strlen + 1) < buffer.Length // Make sure incrementing won't break any bounds
&& buffer[startIndex + strlen] != 0 // The typical null terimation check
)
{
++strlen;
}
return strlen;
}
// This is messy, but I haven't found a built-in way in c# that guarentees null termination
public static string ParseBytes(byte[] buffer, out int strlen, int startIndex = 0)
{
strlen = StringLength(buffer, startIndex);
byte[] c_str = new byte[strlen];
Array.Copy(buffer, startIndex, c_str, 0, strlen);
return Encoding.UTF8.GetString(c_str);
}
The reason for the startIndex was in the example I was working on specifically I needed to parse a byte[] as an array of null terminated strings. It can be safely ignored in the simple case
A LINQ one-liner for converting a byte array byteArrFilename read from a file to a pure ASCII C-style zero-terminated string would be this: Handy for reading things like file index tables in old archive formats.
String filename = new String(byteArrFilename.TakeWhile(x => x != 0)
.Select(x => x < 128 ? (Char)x : '?').ToArray());
I use '?' as the default character for anything not pure ASCII here, but that can be changed, of course. If you want to be sure you can detect it, just use '\0' instead, since the TakeWhile at the start ensures that a string built this way cannot possibly contain '\0' values from the input source.
Try this console application:
static void Main(string[] args)
{
//Encoding _UTF8 = Encoding.UTF8;
string[] _mainString = { "Hello, World!" };
Console.WriteLine("Main String: " + _mainString);
// Convert a string to UTF-8 bytes.
byte[] _utf8Bytes = Encoding.UTF8.GetBytes(_mainString[0]);
// Convert UTF-8 bytes to a string.
string _stringuUnicode = Encoding.UTF8.GetString(_utf8Bytes);
Console.WriteLine("String Unicode: " + _stringuUnicode);
}
Here is a result where you didn’t have to bother with encoding. I used it in my network class and send binary objects as string with it.
public static byte[] String2ByteArray(string str)
{
char[] chars = str.ToArray();
byte[] bytes = new byte[chars.Length * 2];
for (int i = 0; i < chars.Length; i++)
Array.Copy(BitConverter.GetBytes(chars[i]), 0, bytes, i * 2, 2);
return bytes;
}
public static string ByteArray2String(byte[] bytes)
{
char[] chars = new char[bytes.Length / 2];
for (int i = 0; i < chars.Length; i++)
chars[i] = BitConverter.ToChar(bytes, i * 2);
return new string(chars);
}
string result = ASCIIEncoding.UTF8.GetString(byteArray);

Categories

Resources