Convert emoji to hex number - c#

I'm trying to convert an emoji to a hex number or a string.
Is there any way to convert this 👱 into this: 0x00000000D83DDC71L or D83DDC71?
Edit
My code is this:
var bytes = Encoding.UTF8.GetBytes(emoji.ToString()); //emoji is 👱
var number = BitConverter.ToUInt32(bytes, 0); //number is 2610470896
var emojiCode = number.ToString("X"); // emojiCode is 9B989FF0
The problem is that I need my emojiCode to be D83DDC71.
I hope it is clearer now.

You have to do something like:
var str = "\uD83D\uDC71";
string res = BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes(str)).Replace("-", "");
Note that you want the UTF-16 bytes in big-endian order (hence Encoding.BigEndianUnicode); the default Encoding.Unicode is little-endian and would give 3DD871DC instead.
Probably easier without going through the Encoding conversion:
string res = string.Concat(str.Select(x => ((ushort)x).ToString("X4")));
(ushort and char are nearly the same thing, but ushort is built to be formatted as a number, while char is built to be formatted as a character)
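As a minimal sketch of the second approach, assuming a hypothetical helper named ToUtf16Hex in an illustrative EmojiHex class (neither name comes from the answer), the conversion can be wrapped so it works for any string, including emoji made of several UTF-16 code units:

```csharp
using System;
using System.Linq;

public static class EmojiHex
{
    // Hypothetical helper: formats every UTF-16 code unit of the input
    // as four uppercase hex digits and joins them without separators.
    public static string ToUtf16Hex(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return string.Concat(text.Select(c => ((ushort)c).ToString("X4")));
    }
}
```

Calling EmojiHex.ToUtf16Hex("\uD83D\uDC71") should give D83DDC71, the form asked for in the question.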

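In UTF-16 an emoji like this is not a single 16-bit value but a surrogate pair. If you want the Unicode code point itself rather than the UTF-16 form, encode the string as UTF-32 and read it four bytes at a time.
So you could split it like this:
byte[] utfBytes = System.Text.Encoding.UTF32.GetBytes("👱");
Console.WriteLine(utfBytes.Length); // 4 bytes per code point
string result = "";
for (int i = 0; i < utfBytes.Length; i += 4)
{
if (i != 0) result += '-';
// Encoding.UTF32 is little-endian, which matches BitConverter on little-endian machines.
result += System.BitConverter.ToInt32(utfBytes, i).ToString("X2");
}
// result is "1F471" (the code point), not the UTF-16 form D83DDC71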
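If the code point is what you are after, a sketch that avoids the byte array altogether could use char.ConvertToUtf32. The CodePoints class and the ToCodePointHex name are made up for illustration, and the sketch assumes the input contains only well-formed surrogate pairs:

```csharp
using System;
using System.Collections.Generic;

public static class CodePoints
{
    // Hypothetical helper: returns each Unicode code point of the input
    // as an uppercase hex string. Assumes well-formed UTF-16;
    // char.ConvertToUtf32 throws ArgumentException on a lone surrogate.
    public static List<string> ToCodePointHex(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        var result = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint = char.ConvertToUtf32(text, i);
            result.Add(codePoint.ToString("X"));
            // A high surrogate means the code point used two chars; skip the low one.
            if (char.IsHighSurrogate(text[i])) i++;
        }
        return result;
    }
}
```

For the emoji in the question this should yield a single entry, 1F471.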

Related

Convert from hex string (from UCS-2) into UTF-8

I'm using a third party SMS provider and have hit an issue with converting from UCS-2 messages back into readable text.
Their API documentation has this code sample, which converts a message into the UCS-2 hex format I'm picking up from the API.
string message = "Это тестовое сообщение юникода";
byte[] ba = Encoding.BigEndianUnicode.GetBytes (message);
var hexString = BitConverter.ToString (ba);
Console.WriteLine ("#U" + hexString.Replace("-",""));
Which converts the message string into
#U042D0442043E00200442043504410442043E0432043E043500200441043E043E043104490435043D043804350020044E043D0438043A043E04340430
This looks like the UCS-2 messages I'm picking up from their API.
Unfortunately they don't give any code samples of how to convert the messages back into a readable form.
I'm sure it's not in the docs because it's something simple, but I just can't seem to figure out how to do it.
To reverse what you have (the string of hex prefixed with #U)
var message = "Это тестовое сообщение юникода";
var ba = Encoding.BigEndianUnicode.GetBytes(message);
var hexString = BitConverter.ToString(ba);
var encoded = "#U" + hexString.Replace("-", "");
Console.WriteLine(encoded);
// reverse
var bytes = Enumerable.Range(2, encoded.Length-2)
.Where(x => x % 2 == 0)
.Select(x => Convert.ToByte(encoded.Substring(x, 2), 16))
.ToArray();
var result = Encoding.BigEndianUnicode.GetString(bytes);
Console.WriteLine(result);
Output
#U042D0442043E00200442043504410442043E0432043E043500200441043E043E043104490435043D043804350020044E043D0438043A043E04340430
Это тестовое сообщение юникода
It looks like this would be the reverse:
string message = Encoding.BigEndianUnicode.GetString(ba);
The bytes could be extracted with a method such as:
private IEnumerable<byte> GetTheBytes(string uc2Message)
{
// Strip the "#U" prefix, leaving only the hex digits.
string bytesOnly = uc2Message.TrimStart('#', 'U');
// Read two hex digits per byte; the bound keeps i + 1 in range without dropping the last byte.
for (int i = 0; i + 1 < bytesOnly.Length; i += 2)
{
yield return Convert.ToByte($"{bytesOnly[i]}{bytesOnly[i+1]}", 16);
}
}
Console.WriteLine(Encoding.BigEndianUnicode.GetString(GetTheBytes(uc2Message).ToArray()));
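A self-contained sketch of the decoding step, with some input validation added, might look like the following. The Ucs2Hex class and Decode method are hypothetical names, and the check for four hex digits per character is an assumption about the provider's format based on the sample above:

```csharp
using System;
using System.Text;

public static class Ucs2Hex
{
    // Hypothetical helper: turns "#U042D0442..." back into readable text.
    // Assumes the payload is big-endian UTF-16, four hex digits per code unit.
    public static string Decode(string encoded)
    {
        if (encoded == null) throw new ArgumentNullException(nameof(encoded));

        // Drop the "#U" marker if it is present.
        string hex = encoded.StartsWith("#U", StringComparison.Ordinal)
            ? encoded.Substring(2)
            : encoded;

        if (hex.Length % 4 != 0)
            throw new FormatException("Expected four hex digits per UTF-16 code unit.");

        // Two hex digits make one byte.
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return Encoding.BigEndianUnicode.GetString(bytes);
    }
}
```

With the sample from the question, Ucs2Hex.Decode("#U042D0442043E") should return "Это".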

How to get ASCII value of characters in C#

Below is my string in C#, which I am converting to a character array. I need to get the ASCII value of each character in the string.
static void Main(string[] args)
{
string s = "Test";
var arr = s.ToCharArray();
foreach(var a in arr)
{
var n = Encoding.ASCII.GetByteCount(a.ToString());
Console.WriteLine(a);
Console.WriteLine(n);
}
}
This outputs as
T
1
e
1
s
1
t
1
Googling turned up a number of links, but none of them met my need:
How to get ASCII value of string in C#
https://www.codeproject.com/Questions/516802/ConvertingpluscharsplustoplusASCIIplusinplusC
I need to get the ASCII value of each character in the string.
Any help or suggestion is highly appreciated.
A string can be enumerated directly as an IEnumerable<char>, and each char can be cast to an integer to see its UTF-16 value, which for these characters is the Unicode code point. The 128 ASCII characters (0-127) map to the Unicode code points 0-127 (see for example https://en.wikipedia.org/wiki/Code_point), so you can print this number directly.
string s = "Test";
foreach (char a in s)
{
if (a > 127)
{
throw new Exception(string.Format(@"{0} (code \u{1:X04}) is not ASCII!", a, (int)a));
}
Console.WriteLine("{0}: {1}", a, (int)a);
}
GetByteCount will return the count of bytes used, so for each character it will be 1 byte.
Try GetBytes
static void Main(string[] args)
{
string s = "Test";
var n = ASCIIEncoding.ASCII.GetBytes(s);
for (int i = 0; i < s.Length; i++)
{
Console.WriteLine($"Char {s[i]} - byte {n[i]}");
}
}
Every ASCII character is represented in the ASCII table by a value between 0 and 127. Converting a char to an integer gives you that value.
static void Main(string[] args)
{
string s = "Test";
for (int i = 0; i < s.Length; i++)
{
//Convert each letter of the string to its ASCII value, one by one.
int value = s[i];
Console.WriteLine(value);
}
}
You're asking for the byte count when you should be asking for the bytes themselves. Use Encoding.ASCII.GetBytes instead of Encoding.ASCII.GetByteCount. Like in this answer: https://stackoverflow.com/a/400777/3129333
Console.WriteLine(a);
Console.WriteLine(((int)a).ToString("X"));
You need to convert to int and then to hex.
GetByteCount will return the count of bytes used, so for each character it will be 1.
You can also read: Need to convert string/char to ascii values
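Pulling the answers above together, a small sketch could return the values as an array and reject anything outside the ASCII range. The AsciiValues class and its Of method are illustrative names, not part of any library:

```csharp
using System;

public static class AsciiValues
{
    // Hypothetical helper: returns the ASCII value of each character,
    // throwing if a character falls outside the 0-127 range.
    public static int[] Of(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        var values = new int[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c > 127)
                throw new ArgumentException($"Character at index {i} is not ASCII.", nameof(text));
            values[i] = c; // implicit char-to-int conversion
        }
        return values;
    }
}
```

For "Test" this should give 84, 101, 115 and 116.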

need to display the "Hebrew" characters

I have the hexadecimal string of "000302A502B002B202B002B9000302BA02A502A702A902B9" and I need to display the "Hebrew" characters for it.
How can I convert it to Hebrew in a Windows Forms application?
Below is the code I tried (run in a loop until the string is used up):
string hexChar = hexEncodedText.Substring(0, 4);
decodedText += (char)Int64.Parse(hexChar, System.Globalization.NumberStyles.HexNumber);
hexEncodedText = hexEncodedText.Substring(limit, hexEncodedText.Length - limit);
But this does not produce the expected result.
Normally this should work; however, I've tested it and the resulting string shows nothing. I assume your Hebrew Windows code page is 1255:
string input = "000302A502B002B202B002B9000302BA02A502A702A902B9";
byte[] bytes = new byte[input.Length/2];
for (int i = 0; i < input.Length; i += 2){
bytes[i / 2] = byte.Parse(input.Substring(i, 2), System.Globalization.NumberStyles.HexNumber);
}
Encoding encode = Encoding.GetEncoding(1255);
string output = encode.GetString(bytes);
I think the input string is just that.

code translation: repeating a string until some maximum

I was wondering if you could tell me the most efficient way to repeat a string. I need to create a string 33554432 bytes long, repeating the string "hello, world" until it fills that buffer. What is the best way to do it? In C it is easy:
for (i = 0; i < BIGSTRINGLEN - step; i += step)
memcpy(bigstring+i, *s, step);
Thanks.
An efficient way would be to use a StringBuilder:
string text = "hello, world";
StringBuilder builder = new StringBuilder(BIGSTRINGLEN);
while (builder.Length + text.Length <= BIGSTRINGLEN) {
builder.Append(text);
}
string result = builder.ToString();
First, do you want the string to be 33554432 bytes long, or characters long? .NET and C# use 16-bit characters, so they are not equivalent.
If you want 33554432 characters, the naive solution would be string concatenation. See Frédéric Hamidi's answer.
If you want bytes, you will need to do something a bit more interesting:
int targetLength = 33554432;
string filler = "hello, world";
byte[] target = new byte[targetLength];
// Convert filler to bytes. Can use other encodings here.
// I am using ASCII to match C++ output.
byte[] fillerBytes = Encoding.ASCII.GetBytes(filler);
//byte[] fillerBytes = Encoding.Unicode.GetBytes(filler);
//byte[] fillerBytes = Encoding.UTF8.GetBytes(filler);
int position = 0;
while((position + fillerBytes.Length) < target.Length)
{
fillerBytes.CopyTo(target, position);
position += fillerBytes.Length;
}
// At this point, need to possibly do a partial copy.
if (position < target.Length)
{
int bytesNecessary = target.Length - position;
Array.Copy(fillerBytes, 0, target, position, bytesNecessary);
}
I don't know if it's the most efficient way, but if you're using .NET 3.5 or later, this could work:
String.Join("", System.Linq.Enumerable.Repeat("hello, world", 2796203).ToArray()).Substring(0, 33554432);
If the length you want is dynamic, then you can replace some of the hard-coded numbers with simple math.
What about this? Set the StringBuilder to the max expected size and then add the desired string as long as adding another one will not exceed the desired max size.
StringBuilder sb = new StringBuilder(33554432);
int max = sb.Capacity; // the size passed to the constructor; MaxCapacity would be int.MaxValue here
String hello = "hello, world";
while (sb.Length + hello.Length <= max)
{
sb.Append(hello);
}
string longString = sb.ToString();
This avoids a loop that repeatedly adds the string. Instead, I "double" the string until it gets close to the right length and then I put the "doubled" pieces together appropriately.
static string Repeat(string s, int length) {
if (length < s.Length) {
return s.Substring(0, length);
}
var list = new List<string>();
StringBuilder t = new StringBuilder(s);
do {
string temp = t.ToString();
list.Add(temp);
t.Append(temp);
} while(t.Length < length);
int index = list.Count - 1;
StringBuilder sb = new StringBuilder(length);
while (sb.Length < length) {
while (list[index].Length > length) {
index--;
}
if (list[index].Length <= length - sb.Length) {
sb.Append(list[index]);
}
else {
sb.Append(list[index].Substring(0, length - sb.Length));
}
}
return sb.ToString();
}
So, for example, on input ("Hello, World!", 64) we build the strings
13: Hello, World!
26: Hello, World!Hello, World!
52: Hello, World!Hello, World!Hello, World!Hello, World!
Then we would build the result by concatenating the string of length 52 to the substring of length 12 of the string of length 13.
I am, of course, assuming that by bytes you meant length. Otherwise, you can easily modify the above using encodings to get what you want in terms of bytes.
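The same doubling idea can also be sketched with a single char buffer, which skips the intermediate list of strings. This is an alternative illustration under the same "length in characters" assumption, not the code above rewritten; StringRepeat and RepeatToLength are hypothetical names:

```csharp
using System;

public static class StringRepeat
{
    // Hypothetical helper: fills a buffer of the requested length by
    // repeatedly copying the already-filled prefix onto the free tail.
    public static string RepeatToLength(string s, int length)
    {
        if (string.IsNullOrEmpty(s)) throw new ArgumentException("Pattern must not be empty.", nameof(s));
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));

        var buffer = new char[length];

        // Seed the buffer with one (possibly truncated) copy of the pattern.
        int filled = Math.Min(s.Length, length);
        s.CopyTo(0, buffer, 0, filled);

        // Each pass copies at most 'filled' chars, so source and
        // destination ranges never overlap.
        while (filled < length)
        {
            int count = Math.Min(filled, length - filled);
            Array.Copy(buffer, 0, buffer, filled, count);
            filled += count;
        }
        return new string(buffer);
    }
}
```

Because the filled region doubles on each pass, the number of copy calls grows with the logarithm of the target length rather than linearly.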

Format string with dashes

I have a compressed string value I'm extracting from an import file. I need to format this into a parcel number, which is formatted as follows: ##-##-##-###-###. So therefore, the string "410151000640" should become "41-01-51-000-640". I can do this with the following code:
String.Format("{0:##-##-##-###-###}", Convert.ToInt64("410151000640"));
However, the string may not be all numbers; it could have a letter or two in there, and thus the conversion to an integer will fail. Is there a way to do this on a string so that every character, regardless of whether it is a number or a letter, fits into the format correctly?
Regex.Replace("410151000640", @"^(.{2})(.{2})(.{2})(.{3})(.{3})$", "$1-$2-$3-$4-$5");
Or the slightly shorter version
Regex.Replace("410151000640", @"^(..)(..)(..)(...)(...)$", "$1-$2-$3-$4-$5");
I would approach this by having your own formatting method, as long as you know that the "Parcel Number" always conforms to a specific rule.
public static string FormatParcelNumber(string input)
{
if(input.Length != 12)
throw new FormatException("Invalid parcel number. Must be 12 characters");
return String.Format("{0}-{1}-{2}-{3}-{4}",
input.Substring(0,2),
input.Substring(2,2),
input.Substring(4,2),
input.Substring(6,3),
input.Substring(9,3));
}
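If the group sizes might change, the fixed Substring calls can be generalized. The following is only a sketch; the ParcelFormat class, the InsertDashes name, and the groupSizes parameter are all invented for illustration:

```csharp
using System;
using System.Text;

public static class ParcelFormat
{
    // Hypothetical helper: splits the input into groups of the given
    // sizes and joins them with dashes. Letters and digits are treated alike.
    public static string InsertDashes(string input, params int[] groupSizes)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (groupSizes == null) throw new ArgumentNullException(nameof(groupSizes));

        int total = 0;
        foreach (int size in groupSizes) total += size;
        if (input.Length != total)
            throw new FormatException($"Input must be exactly {total} characters long.");

        var sb = new StringBuilder(input.Length + groupSizes.Length);
        int position = 0;
        for (int g = 0; g < groupSizes.Length; g++)
        {
            if (g > 0) sb.Append('-');
            sb.Append(input, position, groupSizes[g]); // copy one group
            position += groupSizes[g];
        }
        return sb.ToString();
    }
}
```

ParcelFormat.InsertDashes("410151000640", 2, 2, 2, 3, 3) should produce 41-01-51-000-640, and the same call works when the input contains letters.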
This should work in your case:
string value = "410151000640";
for( int i = 2; i < value.Length; i+=3){
value = value.Insert( i, "-");
}
Now value contains the string with dashes inserted.
EDIT
I just now saw that you don't have dashes after every second character all the way through, so this requires a small tweak (which makes it a bit clumsier, I'm afraid):
string value = "410151000640";
for( int i = 2; i < value.Length-1; i+=3){
if( value.Count( c => c == '-') >= 3) i++;
value = value.Insert( i, "-");
}
If it's part of a UI, you can use MaskedTextProvider in System.ComponentModel:
MaskedTextProvider prov = new MaskedTextProvider("aa-aa-aa-aaa-aaa");
prov.Set("41x151000a40");
string result = prov.ToDisplayString();
Here is a simple extension method with some utility:
public static string WithMask(this string s, string mask)
{
var slen = Math.Min(s.Length, mask.Length);
var charArray = new char[mask.Length];
var sPos = s.Length - 1;
for (var i = mask.Length - 1; i >= 0 && sPos >= 0;)
if (mask[i] == '#') charArray[i--] = s[sPos--];
else
charArray[i] = mask[i--];
return new string(charArray);
}
Use it as follows:
var s = "276000017812008";
var mask = "###-##-##-##-###-###";
var dashedS = s.WithMask(mask);
You can use it with any string and any character other than # in the mask will be inserted. The mask will work from right to left. You can tweak it to go the other way if you want.
Have fun.
If I understood you correctly, you're looking for a function that removes all letters from a string, aren't you?
I created this on the fly in VB.NET; maybe you can convert it to C# if it's what you're looking for:
Dim str As String = "410151000vb640"
str = String.Format("{0:##-##-##-###-###}", Convert.ToInt64(MakeNumber(str)))
Public Function MakeNumber(ByVal stringInt As String) As String
Dim sb As New System.Text.StringBuilder
For i As Int32 = 0 To stringInt.Length - 1
If Char.IsDigit(stringInt(i)) Then
sb.Append(stringInt(i))
End If
Next
Return sb.ToString
End Function
