Unicode symbol not rendering in Push Notification - c#

I have a number of string literals in my C# code that include Unicode characters. We use them to send push notifications via Azure Notification Hubs.
When I send one of the strings in this first set below the notification renders with the expected text and emoticon.
"\u26a1 Hello! \u26a1", "Hello world."
"Ready to record? \ud83d\udce3", "Let’s do this!"
"\ud83d\udc40 Got anything ?", "We’d love to hear from you."
The strings in the set below do arrive as push notifications, but the special symbols (green circle, amber circle and red circle) do not render. I'll try to grab a screenshot and edit it in.
"Signal Update", "Signal turned Red. \u1F534 Tap for more."
"Signal Update", "Signal is Green. \u1f7e2 Tap for more."
"Signal Active", "Signal is Green. \u1f7e2 Tap for more."
I notice that VS 2022 does not fully highlight the escape sequences that do not work, and that they all contain five hex digits instead of four, but that is likely a red herring. Here is the VS 2022 rendering:
Note the text "...Amber, \u1f7e1 Tap for more". This is how that is rendered in a Push Notification
Note the "1" after the supposed symbol

The reason my Unicode characters were not displaying correctly was that I was not escaping them properly in the C# string literal.
26A1 is an example of a code point that displayed correctly. It lies in the Basic Multilingual Plane (U+0000 to U+FFFF), so it fits in a single UTF-16 code unit and can be escaped with "\u" (note the lowercase "u") followed by exactly four hex digits.
1F7E2 is an example of a code point that did not display correctly. I naively copied the earlier escape sequence and just changed the digits. However, this code point lies above U+FFFF, and because "\u" consumes only four hex digits, "\u1f7e2" is read as U+1F7E followed by a literal "2", which is where the stray digit in the notification came from. Code points in this range need the uppercase escape "\U" followed by exactly eight hex digits, making the string literal "Signal is Green. \U0001F7E2 Tap for more." Note the leading zeroes too.
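The difference between the two escapes can be checked directly. The following is a minimal sketch; the class and member names are made up for illustration. It also shows that the surrogate-pair form used in the working strings above (such as "\ud83d\udce3") is just another spelling of a code point above U+FFFF.

```csharp
using System;

public static class UnicodeEscapeDemo
{
    // "\u" takes exactly four hex digits: this is U+1F7E followed by the character '2'.
    public const string Misparsed = "\u1F7E2";

    // "\U" takes exactly eight hex digits and can express code points above U+FFFF.
    public const string GreenCircle = "\U0001F7E2";

    // The same code point written as an explicit UTF-16 surrogate pair.
    public const string GreenCirclePair = "\ud83d\udfe2";

    // Builds the string at run time from a numeric code point.
    public static string FromCodePoint(int codePoint) => char.ConvertFromUtf32(codePoint);
}
```

Comparing UnicodeEscapeDemo.GreenCircle with UnicodeEscapeDemo.GreenCirclePair or with FromCodePoint(0x1F7E2) should show all three are the same two-char string, while Misparsed ends in the digit '2'.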
References that I found useful were
Escaping characters in C# strings
What's the difference between ASCII and Unicode?

Related

Categorizing this Thai character using the .NET framework

I'm trying to parse some Thai text according to the rules explained here http://www.thai-language.com/ref/spacing
Basically, I want to find strings of characters between whitespace and punctuation similar to how we would do in English. I realise that words themselves are not necessarily split by spaces in Thai, that's OK.
To parse the text I tried simply looping, like
while( Char.IsLetterOrDigit(theText[i++]) ) {}
to find the next character that isn't a letter or digit. That works except for certain characters like this one
which is the second character in this word (I think that's the character 'superscripting' the first character in the word).
This character doesn't seem to be categorized as anything by the Char class, ie:
Char.IsLowSurrogate((char)3657)
Char.IsPunctuation((char)3657)
Char.IsWhiteSpace((char)3657)
Char.IsSymbol((char)3657)
Char.IsSeparator((char)3657)
Char.IsDigit((char)3657)
Char.IsControl((char)3657)
Char.IsLetter((char)3657)
Char.IsSurrogate((char)3657)
all return false.
This character might be a 'tone' - how would that be identified using .NET?
According to Unicode specifications the character is mai tho and is in the group “mark, nonspacing (Mn).”
You can use the Char.GetUnicodeCategory() method to check the type. For non-spacing marks the type is 5, or UnicodeCategory.NonSpacingMark
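As a rough sketch of how that check could be folded into the scanning loop from the question: the helper below treats combining marks as part of a word. The class and method names are hypothetical, and including the spacing and enclosing mark categories alongside NonSpacingMark is an assumption about what the parser should accept.

```csharp
using System;
using System.Globalization;

public static class ThaiTextScanner
{
    // True for letters, digits, and combining marks such as mai tho (U+0E49).
    public static bool IsWordChar(char c)
    {
        if (char.IsLetterOrDigit(c)) return true;
        UnicodeCategory category = char.GetUnicodeCategory(c);
        return category == UnicodeCategory.NonSpacingMark
            || category == UnicodeCategory.SpacingCombiningMark
            || category == UnicodeCategory.EnclosingMark;
    }

    // Returns the index of the first character at or after start
    // that is not a word character, or text.Length if there is none.
    public static int FindWordEnd(string text, int start)
    {
        int i = start;
        while (i < text.Length && IsWordChar(text[i])) i++;
        return i;
    }
}
```

Unlike the original loop, FindWordEnd also stops at the end of the string instead of indexing past it.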

Best way to parse ASCII(?) from a hex string in C#

The string I get in the application includes ASCII(?) characters like !, dP, \b, ( and s#.
These are supposed to be equivalent.
value in database-
\x01\x01\x03!\xea\x01\x00\x00dP\x00\x00\x1f\x8b\b\x00\x00\x00\x00\x00\x04\x00\xe3\xe6\x10\x11\x98\xc3(\xc1\xa2\xc0\xa8\xc0\xa0 \x02\xc4\x0c\x1a\x8c\x1a\x0c\x1as#\x04\x18\xf2\b\x1de\xe6\xe6\xe2\xe2b604\x14`\x94\x98\xc3\ba\x9b\"\xb1M\x80\xec\xc9\x10\xb6\x81\x05\x90=\t\xca6Ab[\x02\xd9\x13\xa1\xea\x8d\x80\xec.\xa8\xb8)\x12\xdb\x0c\xc8n\x81\xaa1\x06\xb2\x1b\x19\xb98A\xe2 \xf5\xb5\x10\xa6\x01\x90Y\rf\x1a\x9a#\x98\x16\b&\xc8\x8cJ\x88Z\x90\x11\xa5\x10Q\x90\xb6\x12\x88(H[1\x84\t\xf2O\xb6\xc0&v\tF\x1e\xa1\a\x8c\xc3\xd9\x8f\x8f\x8d%\x18\x01\xa1\x98\x8d\x97\xea\x01\x00\x00
value I get in my app that includes characters I don't want-
01010321ea010000645000001f8b0800000000000400e3e6101198c328c1a2c0a8c0a02002c40c1a8c1a0c1a73400418f2081d65e6e6e2e26236303414609498c308619b22b14d80ecc910b68105903d09ca3641625b02d913a1ea8d80ec2ea8b82912db0cc86e81aa3106b21b19b93841e220f5b510a60190590d661a9a2398160826c88c4a885a9011a5105190b6128828485b318409f24fb6c0267609461ea1078cc3d98f8f8d251801a1988d97ea0100000a\n\n"3a1ea8d80ec2ea8b82912db0cc86e81aa3106b21b19b93841e220f5b510a60190590d661a9a2398160826c88c4a885a9011a5105190b6128828485b318409f24fb6c0267609461ea1078cc3d98f8f8d251801a1988d97ea0100000a\n\n"3a1ea8d80ec2ea8b82912db0cc86e81aa3106b21b19b93841e220f5b510a60190590d661a9a2398160826c88c4a885a9011a5105190b6128828485b318409f24fb6c0267609461ea1078cc3d98f8f8d251801a1988d97ea0100000a\n\n
You can see that \x01 is 01, then \x03 is 03, then ! is 21. I want to take out all the non-hex values in the second string.
1. What are characters like ! and dP? Are they ASCII?
2. I can remove characters like newlines with hexString = hexString.Replace("\n", ""); but I'm not sure that's the best way to handle all of them.
3. Comparing the two strings, I see that ( = 28 and s# = 7340. Is there a table for this conversion?
My guess, given the quotes around the output, is that the database tool is displaying non-printable or non-ASCII characters as hex (e.g. \x03) and that the actual string contains a single character for each hex-formatted group. In that case there is no difference to pick out: the character d is the same value as \x64, and the tool simply chooses to output visible characters as their normal letter. The same goes for \t, which could be output as \x09, but the tool uses the standard C control-character abbreviations instead.
Found this:
When it is displayed on screen, redis-cli escapes non-printable characters using the \xHH encoding format, where HH is hexadecimal notation.
In other words,
The cli is just using 3 different methods to display the values in the database field:
The character is printable, output the character (e.g. d, P, !, ").
The character is not printable, but has a C language standard escape sequence, output the escape sequence (e.g. \b, \t, \n).
The character is not printable and has no escape sequence, output the hex for the value of the character (e.g. \x03, \x01, \x00).
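The three display rules above can be approximated in a few lines. This is only a sketch of the idea, not redis-cli's actual implementation; the class and method names are invented, and the exact set of short escapes handled (\a, \b, \t, \n, \r, plus the backslash and double quote) is an assumption based on the output shown in the question.

```csharp
using System;
using System.Text;

public static class ByteDisplay
{
    // Parses a string of hex digit pairs (e.g. "0103") into raw bytes.
    public static byte[] FromHex(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even length.", nameof(hex));
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // Renders bytes the way the CLI output in the question appears to.
    public static string Escape(byte[] data)
    {
        var sb = new StringBuilder();
        foreach (byte b in data)
        {
            switch (b)
            {
                // Rule 2: non-printable bytes with a C-style short escape.
                case 0x07: sb.Append("\\a"); break;
                case 0x08: sb.Append("\\b"); break;
                case 0x09: sb.Append("\\t"); break;
                case 0x0A: sb.Append("\\n"); break;
                case 0x0D: sb.Append("\\r"); break;
                // Printable, but escaped so the quoted output stays unambiguous.
                case (byte)'\\': sb.Append("\\\\"); break;
                case (byte)'"': sb.Append("\\\""); break;
                default:
                    if (b >= 0x20 && b <= 0x7E)
                        sb.Append((char)b);                           // Rule 1: printable ASCII as-is.
                    else
                        sb.Append("\\x").Append(b.ToString("x2"));    // Rule 3: everything else as \xHH.
                    break;
            }
        }
        return sb.ToString();
    }
}
```

Feeding the first bytes of the application's hex string through FromHex and then Escape should reproduce the start of the database display, which suggests both values describe the same bytes.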

How to display non-printable Ascii characters?

I came across this simple code to output ascii to the console:
Console.Write((char)1); //Output ☺
The thing is, it only works when I change the fonts of the console to RasterFonts, and it's ugly. I mean, look at those old text-based games, how did they draw some ascii art like this?
The Amazing Adventures of ANSI Dude, Snipes
How can I draw nice Ascii on that console?
Unless for some reason you are restricted to ASCII characters, you should use proper Unicode characters. That avoids potential conflicts from mapping control characters (0-31) to printable characters, and lets you use lines and borders directly with the .NET String type without going through encodings (line and border characters are part of "extended ASCII" and, unlike the regular 7-bit ASCII codes 0-127, do not map directly to the Unicode code points with the same values).
Unicode "\u263a" would produce the face you are looking for. For drawing borders and lines, use characters from the Unicode box drawing range; for more characters, see the overall table at http://unicode.org/charts/.
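A small sketch of that approach, assuming the console font has glyphs for these characters: the hypothetical Frame helper builds a one-line box out of box drawing characters, and Print switches the console to UTF-8 output before writing it.

```csharp
using System;
using System.Text;

public static class BoxArt
{
    // Wraps a single line of text in a frame made of Unicode box drawing characters.
    public static string Frame(string text)
    {
        string bar = new string('\u2500', text.Length + 2);   // horizontal line
        return "\u250c" + bar + "\u2510\n"                     // top corners
             + "\u2502 " + text + " \u2502\n"                  // vertical sides
             + "\u2514" + bar + "\u2518";                      // bottom corners
    }

    // Writes the framed text to the console using UTF-8 output.
    public static void Print(string text)
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Frame(text));
    }
}
```

Calling BoxArt.Print("\u263a") should draw the smiley inside a small box; whether it renders cleanly still depends on the console font.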

How to present a character Unicode with 5 digit (Hex) with c# language

I want to print a Unicode character with 5 hexadecimal digits on the screen (for example to write it on a Windows Forms button).
For example, the code point of the Ace of Hearts character is 1F0B1. I tried it with \x, but that accepts at most 4 hex digits.
You can use the \U escape sequence:
string text = "Ace of hearts: \U0001f0b1";
Of course, you'll have to be using a font which supports that character...
As an aside, I'd strongly recommend avoiding the \x escape sequence, as it's hard to read. For example:
string good = "Bell: \x7Good compiler";
string bad = "Bell: \x7Bad compiler";
When presented together, at first glance it would seem that these are both "Bell: " followed by U+0007 followed by either "Good compiler" or "Bad compiler"... but because "Bad" is entirely composed of valid hex characters, the second string is actually "Bell: " followed by U+7BAD followed by " compiler".
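The pitfall is easy to confirm by inspecting the character at index 6, right after "Bell: ". In this sketch (class and constant names are illustrative), the third constant shows the fixed-width \u0007 form, which cannot absorb the letters that follow it.

```csharp
public static class HexEscapePitfall
{
    // \x7 stops at 'G' because 'G' is not a hex digit: U+0007, then "Good compiler".
    public const string Good = "Bell: \x7Good compiler";

    // \x greedily takes up to four hex digits: U+7BAD, then " compiler".
    public const string Bad = "Bell: \x7Bad compiler";

    // \u always takes exactly four hex digits: U+0007, then "Bad compiler".
    public const string Safe = "Bell: \u0007Bad compiler";
}
```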

Can I get my richtextbox to send a carriage return instead of a newline character

I have a terminal emulator app that sends characters to a hardware device. I have implemented the terminal using a richtextbox. When the user types I read the characters from the richtextbox and send them to a hardware device via a serial port.
The hardware device expects a command string terminated with carriage return \r. When I extract the text characters from my rich text box much to my chagrin when the user hits the enter key all I get is the newline character \n.
I can replace the newline character with a carriage return on the extracted text easily enough but I was curious is there a way to massage the richtextbox so it will send a \r instead of a \n?
even better would be that it sends a \r \n when the user hits the enter key?
I'm not sure a rich text control can display a carriage return character properly anyway. I would stick with post-processing. A simple find and replace before sending the text to your hardware device will do the trick.
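A minimal sketch of that post-processing step, with a hypothetical helper name: it first collapses any existing line ending to \n and then substitutes whatever terminator the device expects, so it works for both \r and \r\n.

```csharp
public static class LineEndings
{
    // Rewrites every line ending in text (\r\n, \r, or \n) to the given terminator.
    public static string ToTerminator(string text, string terminator)
    {
        return text.Replace("\r\n", "\n")   // collapse CRLF first so it is not doubled
                   .Replace("\r", "\n")     // then any lone CR
                   .Replace("\n", terminator);
    }
}
```

For the device in the question, something like LineEndings.ToTerminator(richTextBoxText, "\r") would be applied to the extracted text just before writing it to the serial port, where richTextBoxText stands for whatever string was read from the control.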
You could catch the user input in the TextChanged event.
That gives you the latest character written to the RichTextBox.
If it is a newline (\n), you can replace it with a carriage return (\r).
The RichTextBox is able to display both newline characters. =)
Hope that helps.
