Convert hexadecimal unicode character into its visual representation - c#

I am trying to write a C# program that converts a Unicode character from its hexadecimal format to a single character, and I have a problem. This is my code:
This works:
char e = Convert.ToChar("\u0066");
However, this doesn't work:
Console.WriteLine("enter unicode format character (for example \\u0066)");
string s = Console.ReadLine();
Console.WriteLine("you entered (for example f)");
char c = Convert.ToChar(s);
Because (Convert.ToChar("\\u0066")) gives the error:
String must be exactly one character long
Anyone have an idea how to do this?

int.Parse doesn't like the "\u" prefix, but if you validate first to ensure that it's there, you can use
char c = (char)int.Parse(s.Substring(2), NumberStyles.HexNumber);
This strips the first two characters from the input string and parses the remaining text.
In order to ensure that the sequence is a valid one, try this:
Regex reg = new Regex(@"^\\u([0-9A-Fa-f]{4})$");
if( reg.IsMatch(s) )
{
char c = (char)int.Parse(s.Substring(2), NumberStyles.HexNumber);
}
else
{
// Error
}

Convert.ToChar("\u0066");
This is a one-character string at run-time, because the compiler processed the backslash sequence.
The rest of your code is dealing with six character strings { '\\', 'u', '0', '0', '6', '6' }, which Convert.ToChar cannot handle.
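char.Parse has the same one-character restriction, so strip the "\u" prefix first and parse the remaining hex digits, for example with ushort.Parse(s.Substring(2), NumberStyles.AllowHexSpecifier) followed by a cast to char.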
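The two answers above can be combined into one small helper. This is only a sketch: the names UnicodeEscape and TryParse are made up for illustration, and it assumes the input is exactly a backslash, 'u', and four hex digits, so it covers single UTF-16 code units only.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeEscape
{
    // A literal backslash, 'u', and exactly four hex digits (captured in group 1).
    private static readonly Regex Pattern = new Regex(@"^\\u([0-9A-Fa-f]{4})$");

    // Hypothetical helper: returns true and sets 'result' when 'input' looks like \uXXXX.
    public static bool TryParse(string input, out char result)
    {
        result = '\0';
        if (input == null) return false;

        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        // Parse only the captured hex digits, then cast the code unit to char.
        result = (char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("Enter a Unicode escape (for example \\u0066):");
        string s = Console.ReadLine();

        if (UnicodeEscape.TryParse(s, out char c))
            Console.WriteLine("You entered: " + c);
        else
            Console.WriteLine("Input must look like \\uXXXX with four hex digits.");
    }
}
```

Returning a bool instead of throwing keeps the validation and the conversion in one place, which matches the "validate first" advice in the first answer.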

Related

c# cannot convert from string to char

I am trying to write some code to display any symbols present in a password the user gives you. I am quite new and am trying to use IsSymbol, but I am stuck. It says "cannot convert from string to char".
using System;
namespace booleanprok
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Enter a made up password:");
string madeUppw = Console.ReadLine();
Console.WriteLine(char.IsSymbol(madeUppw));
}
}
}
"I am trying to write some code to display any symbols present in a password the user gives you."
Given the above statement, I see the following problems with the sample code given:
you're passing a string to the IsSymbol() method, which expects a char.
you're outputting the return value from the IsSymbol() method (which is a bool) instead of the characters themselves.
IsSymbol() does not return true for all characters that we typically consider symbols in a password (like !, @, #, etc). From the documentation: "symbols are members of the following categories in UnicodeCategory: MathSymbol, CurrencySymbol, ModifierSymbol, and OtherSymbol."
One way to solve these issues is to consider any character that's not alphabetic or numeric to be a "symbol", which we can do by using the Linq extension method Where() along with the char.IsLetter() and char.IsDigit() methods. Then we can output the characters to the console using string.Join on the results.
For example:
Console.Write("Enter a made up password: ");
string madeUpPwd = Console.ReadLine();
// Get the distinct characters that aren't Letters or Digits
IEnumerable<char> symbols = madeUpPwd
.Where(c => !char.IsLetter(c) && !char.IsDigit(c))
.Distinct();
// Output them to the console (separated by commas and wrapped in single quotes)
Console.WriteLine($"You entered the symbols: '{string.Join("', '", symbols)}'");
Sample Output
(Note that using .Where(char.IsSymbol) would have returned only the '$' character)
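To see why char.IsSymbol misses common password punctuation, it can help to print the Unicode category of a few characters. The sketch below is illustrative only; SymbolCategories and Describe are hypothetical names, and the sample characters are an arbitrary choice.

```csharp
using System;
using System.Globalization;

public static class SymbolCategories
{
    // Hypothetical helper: describes one character's category and IsSymbol result.
    public static string Describe(char c)
    {
        UnicodeCategory category = char.GetUnicodeCategory(c);
        return $"'{c}': {category}, IsSymbol = {char.IsSymbol(c)}";
    }

    public static void Main()
    {
        // '!', '@' and '#' are punctuation, so IsSymbol is false for them;
        // '$', '+' and '^' fall into the symbol categories.
        foreach (char c in "!@#$+^")
        {
            Console.WriteLine(Describe(c));
        }
    }
}
```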
char.IsSymbol accepts a char argument, but you're passing a parameter of type string. If you're sure the input will only be one character in length, or if you just want the first character and disregard others, you can call char.IsSymbol(madeUppw[0]);
However, you can force reading a single character with Console.ReadKey and get the value with KeyChar:
char madeUppw = Console.ReadKey().KeyChar;
Console.WriteLine(char.IsSymbol(madeUppw));
Convert string to char
Try this one
Console.WriteLine("Enter a made up password:");
string madeUppw = Console.ReadLine();
foreach (char s in madeUppw)
{
    // char.IsSymbol takes a single char, so test each character in turn
    bool res = char.IsSymbol(s);
    Console.WriteLine($"{s}: {res}");
}
If you have a single-character string, you can also try
string str = "A";
char character = char.Parse(str);
Or
string str = "A";
char character = str.ToCharArray()[0];
A string consists of 0 or more characters, so to check whether any of them is a symbol you need to step through the characters in the string and test each one individually. You could do so using Enumerable.Any and char.IsSymbol:
string madeUppw = Console.ReadLine();
Console.WriteLine(madeUppw.Any(x=>char.IsSymbol(x)));
Enumerable.Any checks whether any element in the sequence (in this case, a string) satisfies a condition (in this case, whether the character is a symbol).
The last line could be trimmed down further to
Console.WriteLine(madeUppw.Any(char.IsSymbol));
If you need to print all the symbols in the string, you could use
Console.WriteLine(string.Join(",",madeUppw.Where(x=>char.IsSymbol(x))));

C# Weird Backslash on Convert.ToChar()

I'm trying to convert a xml character entity to a C# char...
string charString = "&#x2081;".Replace("&#", "\\").Replace(";", "");
char c = Convert.ToChar(charString);
I have no idea why it is failing on the Convert.ToChar line. Even though the debugger shows charString as "\\x2081", it really is "\x2081", which is a valid Unicode character. The exception says there are too many characters.
The documentation for ToChar(string) is quite readable:
Converts the first character of a specified string to a Unicode character.
Also:
FormatException – The length of value is not 1.
It will not convert a hex representation of your character into said character. It will take a one-character string and give you that character back. The same as doing s[0].
What you want is:
string hex = "&#x2081;".Replace("&#x", "").Replace(";", "");
char c = (char)Convert.ToInt32(hex, 16);
Convert.ToChar(value) requires value to be a string of length 1, but charString is "\\x2081", which is longer than one character.
It seems "&#x2081;" is a Unicode hex character code (it represents ₁), so you should do this instead:
string charString = "&#x2081;".Replace("&#x", "").Replace(";", "");
char c = (char)int.Parse(charString, NumberStyles.HexNumber);
Note: it's the HTML entity (hex) of SUBSCRIPT ONE.
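As a rough sketch of the same idea with validation, the helper below decodes one hexadecimal character entity. EntityDecoding and DecodeHexEntity are hypothetical names, and the helper assumes the input is a single entity of the form &#xHHHH; and nothing else. The Main method also calls System.Net.WebUtility.HtmlDecode, a framework method that decodes such entities in HTML text, for comparison.

```csharp
using System;
using System.Globalization;
using System.Net;

public static class EntityDecoding
{
    // Hypothetical helper: handles only the hexadecimal form &#xHHHH;
    public static string DecodeHexEntity(string entity)
    {
        if (entity == null
            || !entity.StartsWith("&#x", StringComparison.OrdinalIgnoreCase)
            || !entity.EndsWith(";", StringComparison.Ordinal))
        {
            throw new FormatException("Expected an entity of the form &#xHHHH;");
        }

        // Keep only the hex digits between "&#x" and ";".
        string hex = entity.Substring(3, entity.Length - 4);
        int codePoint = int.Parse(hex, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);

        // ConvertFromUtf32 returns a string because code points above U+FFFF need two chars.
        return char.ConvertFromUtf32(codePoint);
    }

    public static void Main()
    {
        // Both lines compare the decoded text with SUBSCRIPT ONE (U+2081).
        Console.WriteLine(DecodeHexEntity("&#x2081;") == "\u2081");
        Console.WriteLine(WebUtility.HtmlDecode("&#x2081;") == "\u2081");
    }
}
```

Returning a string rather than a char is a deliberate choice here: a cast to char silently truncates any code point that does not fit in one UTF-16 code unit.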

No need to use escape seq. for ' in C#?

On MSDN I can read that \' is escape sequence for ' char. However I am able to use it in string without escape sequence like this:
Console.WriteLine("Press 'X' ");
How is this possible?
But how would you write it as a char?
char c = '\'';
char (a single character literal) is a different data type than string (a multi character literal).
In C# a char is declared as:
var c = 'c';
whereas a string is declared as:
var s = "asdf";
As you can see the single quote (') would need to be escaped to declare a char containing the single quote:
var c = '\'';
Escaping with \' is needed in char literals, because an unescaped ' would be read as the boundary of the literal. In strings it is unnecessary because there is nothing to confuse it with. In strings, in turn, it is \" that needs escaping.
It says that you have to escape ' for a char data type.
char c = '''; // compiler throws error
char c = '\''; // valid

C#: split a string into runs of characters, numbers and delimited strings and process it

OK my regex is a bit rusty and I've been struggling with this particular problem...
I need to split and process a string containing any number of the following, in any order:
Chars (lowercase letters only)
Quote delimited strings
Ints
The strings are pretty weird (I don't have control over them). When there's more than one number in a row in the string they're separated by a comma. They need to be processed in the same order that they appeared in the original string.
For example, a string might look like:
abc20a"Hi""OK"100,20b
With this particular string the resulting call stack would look a bit like:
ProcessLetters( new[] { 'a', 'b', 'c' } );
ProcessInts( 20 );
ProcessLetters( 'a' );
ProcessStrings( new[] { "Hi", "OK" } );
ProcessInts( new[] { 100, 20 } );
ProcessLetters( 'b' );
What I could do is treat it a bit like CSV, where you build tokens by processing the characters one at a time, but I think it could be more easily done with a regex?
You can use the pattern contained in this string:
@"(""[^""]*""|[a-z]|\d+)"
to tokenize the input string you provided. This pattern captures three things: simple quoted strings (no embedded quotes), lower-case characters, and one or more digits.
If your quoted strings can have escaped quotes within them (e.g., "Hi\"There\"""OK""Pilgrim") then you can use this pattern to capture and tokenize them along with the rest of the input string:
@"((?:""[^""\\]*(?:\\.[^""\\]*)*"")|[a-z]|\d+)"
Here's an example:
MatchCollection matches = Regex.Matches(@"abc20a""Hi\""There\""""""OK""""Pilgrim""100,20b", @"((?:""[^""\\]*(?:\\.[^""\\]*)*"")|[a-z]|\d+)");
foreach (Match match in matches)
{
Console.WriteLine(match.Value);
}
Returns the string tokens:
a
b
c
20
a
"Hi\"There\""
"OK"
"Pilgrim"
100
20
b
One of the nice things about this approach is that you can just check the first character of each token to see which handler it belongs to. If the first character is alphabetic, the token goes to ProcessLetters; if it is numeric, it goes to ProcessInts. If the first character is a quote, it goes to ProcessStrings after trimming the leading and trailing quotes and calling Regex.Unescape() to unescape the embedded quotes.
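The first-character check described above might look roughly like the following. This is a sketch rather than the answerer's code: Tokenizer and Classify are hypothetical names, and instead of calling the question's ProcessLetters, ProcessInts, and ProcessStrings it returns (kind, value) pairs in source order. Grouping consecutive tokens of the same kind into one call is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Tokenizer
{
    // Same alternatives as the pattern above: quoted string, single lowercase letter, or digits.
    private static readonly Regex TokenPattern =
        new Regex(@"(?:""[^""\\]*(?:\\.[^""\\]*)*"")|[a-z]|\d+");

    // Hypothetical helper: classifies each token by its first character.
    public static List<KeyValuePair<string, string>> Classify(string input)
    {
        var result = new List<KeyValuePair<string, string>>();

        foreach (Match match in TokenPattern.Matches(input))
        {
            string token = match.Value;
            char first = token[0];

            if (first == '"')
            {
                // Trim the surrounding quotes, then unescape any embedded \" sequences.
                string inner = token.Substring(1, token.Length - 2);
                result.Add(new KeyValuePair<string, string>("string", Regex.Unescape(inner)));
            }
            else if (char.IsDigit(first))
            {
                result.Add(new KeyValuePair<string, string>("int", token));
            }
            else
            {
                result.Add(new KeyValuePair<string, string>("letter", token));
            }
        }

        return result;
    }

    public static void Main()
    {
        foreach (var pair in Classify(@"abc20a""Hi""""OK""100,20b"))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

Commas never match any alternative, so they are skipped without special handling.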
You can make your regexp match each of the three separate options with the or operator |. This should catch valid tokens, skipping commas and other chars.
/[a-z]|[0-9]+|"[^"]*"/
Can your strings contain escaped quotes?
static void Main(string[] args)
{
string test = @"abc20a""Hi""""OK""100,20b";
string[] results = Regex.Split(test, @"(""[a-zA-Z]+""|\d+|[a-zA-Z]+)");
foreach (string result in results)
{
if (!String.IsNullOrEmpty(result) && result != ",")
{
Console.WriteLine("result: " + result);
}
}
Console.ReadLine();
}

How to get a variable of type char to have a value of ' in c#

A bit of a dumb one... but how do I get a variable of type char to have a value of '?
e. g.
char c = ''';
char a = 'A';
char b = 'B';
char c = '\'';
the backslash is called an escape character.
and if you want a backslash it's
char c = '\\';
Here's some more for good measure:
\t - tab
\n - new line
\uXXXX - where XXXX is the hexadecimal value of a unicode character
char a = 65 means 'A' in C++. I don't know whether it will work in C#.
To complete your answer:
In C#, the following statement won't compile, because you're trying to put an Int32 into a Char (no implicit conversion exists):
char a = 65;
To convert an ASCII code to a character, you can use an explicit cast, (char)65, or the static Convert class:
char a = Convert.ToChar(65); // where 65 is the ascii
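A short sketch pulling the answers in this thread together: the escaped quote and backslash literals, plus an explicit cast as an alternative to Convert.ToChar. The class and method names (CharLiterals, FromCode) are made up for illustration.

```csharp
using System;

public static class CharLiterals
{
    // Hypothetical helper: an explicit cast converts a numeric code to a char.
    public static char FromCode(int code)
    {
        return (char)code;
    }

    public static void Main()
    {
        char quote = '\'';       // single quote needs escaping in a char literal
        char backslash = '\\';   // so does the backslash itself
        char viaCast = FromCode(65);
        char viaConvert = Convert.ToChar(65);

        Console.WriteLine(quote);
        Console.WriteLine(backslash);
        Console.WriteLine(viaCast == viaConvert);
    }
}
```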
