Can you please explain this code to me?
return string.Join(string.Empty, checkSum.Select(x => Convert.ToString(x, 2).PadLeft(8, '0')));
checkSum will have a binary value like 10101010100011.
I searched Google but didn't find a clear explanation.
You can find the PadLeft documentation here:
https://learn.microsoft.com/en-us/dotnet/api/system.string.padleft
It works by "resizing" the string to a certain length by prepending spaces to make it the desired length. That same page includes this example:
string str = "BBQ and Slaw";
Console.WriteLine(str.PadLeft(15)); // Displays "   BBQ and Slaw".
Console.WriteLine(str.PadLeft(5)); // Displays "BBQ and Slaw".
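In your code, each number in the checkSum array is mapped via Select to its binary (base-2) string representation (Convert.ToString(x, 2)). Each of those strings is then padded on the left with zeroes, rather than spaces, so that it is always at least 8 characters long. Finally, string.Join with string.Empty concatenates the padded strings with no separator.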
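The mapping described above can be sketched as a small helper. This is only an illustration: the question does not show the type of checkSum, so byte[] is an assumption, and the names ChecksumBits and ToBitString are made up for the example.

```csharp
using System;
using System.Linq;

public static class ChecksumBits
{
    // Assumption: checkSum is a byte[]; the original question does not show its type.
    // Each byte becomes its base-2 string, is left-padded with '0' to 8 characters,
    // and the pieces are concatenated with no separator.
    public static string ToBitString(byte[] checkSum)
    {
        return string.Join(string.Empty,
            checkSum.Select(x => Convert.ToString(x, 2).PadLeft(8, '0')));
    }
}
```

For example, ChecksumBits.ToBitString(new byte[] { 5, 255 }) gives "0000010111111111": 5 is "101" in binary and is padded to "00000101", while 255 is already 8 digits long.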
I'm trying to parse some phone numbers, and I have a function to check if the parsed string is made up of only numbers or the + sign.
In some of them there is a hidden character with the value 8236.
Comparing it against '\0' and '\u8236' doesn't work...
What is this character and how do I remove it?
Thanks to @Maximilian Gerhardt, who sent this link in a comment: https://www.fileformat.info/info/unicode/char/202c/index.htm
I learned that 8236 corresponds to the character '\u202c'.
So I did str.Trim('\u202c')
And it worked.
edit:
The simple way to get the corresponding code is to convert from decimal to hex.
8236(decimal) -> 202C(hexadecimal)
I had the same issue, but with character 8237, which led me to this post.
This corresponds with character \u202d.
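As a rough sketch of the decimal-to-hex step above, the helper below formats a character value as a \uXXXX escape and strips the two characters mentioned in this thread (U+202C POP DIRECTIONAL FORMATTING and U+202D LEFT-TO-RIGHT OVERRIDE). The class and method names are hypothetical. Note that Trim only removes characters at the start and end of a string, so Replace is used here on the assumption that the character may also appear in the middle.

```csharp
using System;

public static class HiddenChars
{
    // Formats a character value as a C# escape, e.g. 8236 -> "\u202C".
    public static string ToEscape(int codeUnit)
    {
        return "\\u" + codeUnit.ToString("X4");
    }

    // Removes U+202C and U+202D wherever they occur in the string.
    // Strings are immutable, so the cleaned copy is returned.
    public static string RemoveDirectionalMarks(string s)
    {
        return s.Replace("\u202C", string.Empty)
                .Replace("\u202D", string.Empty);
    }
}
```

HiddenChars.ToEscape(8237) returns "\u202D" as text, which matches the second case reported above.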
I have what I think is an easy problem. For some reason the following code generates the exception, "String must be exactly one character long".
int n = 0;
foreach (char letter in charMsg)
{
    // Get the integral value of the character.
    int value = Convert.ToInt32(letter);
    // Convert the decimal value to a hexadecimal value in string form.
    string hexOutput = String.Format("{0:X}", value);
    //Console.WriteLine("Hexadecimal value of {0} is {1}", letter, hexOutput);
    charMsg[n] = Convert.ToChar(hexOutput);
    n++;
}
The exception occurs at the charMsg[n] = Convert.ToChar(hexOutput); line. Why does it happen? When I check the values of charMsg, it seems to contain all of them properly, yet it still throws an error at me.
UPDATE: I've solved this problem, it was my mistake. Sorry for bothering you.
OK, this was a really stupid mistake on my part. The point is, with my problem I'm not even supposed to do this, as hex values clearly won't help me in any way.
What I am trying to do is to encrypt a message in an image. I've already encrypted the length of said message in the last digits of each color channel of the first pixel. Now I'm trying to put the message itself in there. I looked here: http://en.wikipedia.org/wiki/ASCII and said to myself without thinking that using hexes would be a good idea. Can't believe I thought that.
Convert.ToChar( string s ), per the documentation, requires a single-character string; otherwise it throws a FormatException, as you've noted. It is a rough, though more restrictive, equivalent of
public char string2char( string s )
{
    return s[0] ;
}
Your code does the following:
Iterates over all the characters in some enumerable collection of characters.
For each such character, it...
Converts the char to an int. Hint: a char is an integral type: it's an unsigned 16-bit integral value.
Converts that value to a string containing a hex representation of the character in question. For most characters, that string will be at least two characters in length: for instance, converting the space character (' ', 0x20) this way will give you the string "20".
Then tries to convert that string back to a char and replace the current item being iterated over. This is where your exception is thrown. One thing you should note here is that altering a collection being enumerated is likely to cause the enumerator to throw an exception.
What exactly are you trying to accomplish here? For instance, given a charMsg that consists of 3 characters, 'a', 'b' and 'c', what should happen? A clear problem statement helps us to help you.
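The failure described in the steps above can be reproduced in isolation. The sketch below uses hypothetical helper names; it shows that Convert.ToChar rejects a two-character string, and that, if a round trip through hex were actually wanted, the hex string would have to be parsed as a base-16 number and then cast to char.

```csharp
using System;

public static class HexChars
{
    // char -> hex string, e.g. ' ' -> "20".
    public static string ToHex(char c)
    {
        return ((int)c).ToString("X");
    }

    // hex string -> char: parse the digits as a base-16 number, then cast.
    public static char FromHex(string hex)
    {
        return (char)Convert.ToInt32(hex, 16);
    }

    // Reports whether Convert.ToChar(string) rejects the given string.
    public static bool ThrowsFormatException(string s)
    {
        try
        {
            Convert.ToChar(s);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }
}
```

Here HexChars.ThrowsFormatException("20") is true, while HexChars.FromHex("20") returns the space character.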
Since Unicode characters can be anywhere in the range from 0x0000 to 0xFFFF, your hexOutput variable can hold more than one character. This is why the error is thrown.
Convert.ToChar(string) always checks the length of the string, and throws if it is not equal to 1. So it will not parse the string 0x30 as a hexadecimal number and then convert it to the corresponding ASCII symbol, 0.
Can you elaborate on what you are trying to achieve?
Your hexOutput is a string, and I'm assuming charMsg is a character array. Suppose the first element in charMsg is 'p', or hex value 70. The documentation for Convert.ToChar(string) says it'll use just the first character of the string ('7'), but it's wrong. It'll throw this error. You can test this with a static example, like charMsg[n] = Convert.ToChar("70");. You'll get the same error.
Are you trying to replace characters with hex values? If so, you might try using a StringBuilder object instead of your array assignments.
Convert.ToChar(string) also throws this error if the string is empty. Use CChar() instead (note that CChar is a Visual Basic function).
I have a string variable that holds the Chinese version of an English message.
String temp = "'％1'不能输入步骤'％2'";
But when I want to check whether the string contains %1 by using the IndexOf function
if(temp.IndexOf("%1") != -1)
{
}
the condition is not true, even though the string contains %1.
Is this an issue with the Chinese characters, or something else?
Please suggest how I can get the index of a character in the above case.
That is because ％1 (with a full-width percent sign) is not equal to %1 (with an ASCII percent sign). As a workaround, you can select the symbols out of the string you have, like
var s = "'％1'不能输入步骤'％2'";
var firstFragment = s.Substring(1, 2); // this should select ％1
and then do
if(temp.IndexOf(firstFragment) != -1){
}
Comments gave the answer. Use the same percent character, so instead of:
"%1"
use:
"％1"
Or, if you find that problematic (your source code is in a "poor" code page, or you fear the code is hard to read when it contains full-width characters that resemble ASCII characters), use:
"\uFF051"
or even:
"\uFF05" + "1"
(the concatenation will be done by the C# compiler, so no extra concatenation happens at run-time).
Another approach might be Unicode normalization:
temp = temp.Normalize(NormalizationForm.FormKC);
which seems to project the "exotic" percent char into the usual ASCII percent char, although I am not sure if that behavior is guaranteed, but see the Decomposition field on Unicode Character 'FULLWIDTH PERCENT SIGN' (U+FF05).
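Both approaches can be checked with a short sketch. The helper names are hypothetical, and StringComparison.Ordinal is passed explicitly here (an addition not in the original question) so that the result does not depend on the current culture. U+FF05 has a compatibility decomposition to U+0025, which is why FormKC maps it to the ASCII percent sign.

```csharp
using System;
using System.Text;

public static class FullWidthPercent
{
    // Culture-independent search; returns -1 when value is not found.
    public static int OrdinalIndexOf(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    // Applies compatibility normalization, which maps full-width forms
    // such as U+FF05 to their ASCII counterparts.
    public static string ToCompatibilityForm(string text)
    {
        return text.Normalize(NormalizationForm.FormKC);
    }
}
```

With temp = "'\uFF051'不能输入步骤'\uFF052'", searching for "%1" returns -1, searching for "\uFF051" returns 1, and searching for "%1" in ToCompatibilityForm(temp) returns 1.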
I am using Visual Studio 2010 with C# to convert text into Unicode values. For example, I have a string abc = "मेरा".
There are 4 characters in this string. I need the Unicode values of all four characters.
Please help me.
When you write code like string abc= "मेरा";, you already have it as Unicode (specifically, UTF-16), so you don't have to convert anything. If you want to access the individual characters, you can do that using a normal index: e.g. abc[1] is े (DEVANAGARI VOWEL SIGN E).
If you want to see the numeric representations of those characters, just cast them to integers. For example
abc.Select(c => (int)c)
gives the sequence of numbers 2350, 2375, 2352, 2366. If you want to see the hexadecimal representation of those numbers, use ToString():
abc.Select(c => ((int)c).ToString("x4"))
returns the sequence of strings "092e", "0947", "0930", "093e".
Note that when I said numeric representations, I actually meant their encoding using UTF-16. For characters in the Basic Multilingual Plane, this is the same as their Unicode code point. The vast majority of used characters lie in BMP, including those 4 Hindi characters presented here.
If you wanted to handle characters in other planes too, you could use code like the following.
byte[] bytes = Encoding.UTF32.GetBytes(abc);
int codePointCount = bytes.Length / 4;
int[] codePoints = new int[codePointCount];
for (int i = 0; i < codePointCount; i++)
    codePoints[i] = BitConverter.ToInt32(bytes, i * 4);
Since UTF-32 encodes all (21-bit) code points directly, this will give you them. (Maybe there is a more straightforward solution, but I haven't found one.)
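One possibly more direct alternative to the UTF-32 round trip is char.ConvertToUtf32(string, int), which returns the code point at a given index and combines a surrogate pair when one starts there. The sketch below uses hypothetical names and assumes the input is well-formed UTF-16; a lone surrogate makes ConvertToUtf32 throw an ArgumentException.

```csharp
using System;
using System.Collections.Generic;

public static class CodePoints
{
    // Returns the Unicode code points of s, in order.
    // Assumption: s is well-formed UTF-16 (no unpaired surrogates).
    public static List<int> Enumerate(string s)
    {
        var result = new List<int>();
        for (int i = 0; i < s.Length; i++)
        {
            result.Add(char.ConvertToUtf32(s, i));
            // A high surrogate means the next char was consumed as well.
            if (char.IsHighSurrogate(s[i]))
                i++;
        }
        return result;
    }
}
```

For the string in the question this yields 2350, 2375, 2352, 2366, the same values as the cast-based approach, since all four characters are in the BMP.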
Since a .NET char is a Unicode character (at least, for BMP code points), you can simply enumerate all characters in a string:
var abc = "मेरा";
foreach (var c in abc)
{
    Console.WriteLine((int)c);
}
resulting in
2350
2375
2352
2366
Use
System.Text.Encoding.UTF8.GetBytes(abc)
That will return the UTF-8 encoded bytes of the string (not the code point values themselves).
If you are trying to convert files from a legacy encoding into Unicode:
Read the file, supplying the correct encoding of the source files, then write the file using the desired Unicode encoding scheme.
using (StreamReader reader = new StreamReader(@"C:\MyFile.txt", Encoding.GetEncoding("ISCII")))
using (StreamWriter writer = new StreamWriter(@"C:\MyConvertedFile.txt", false, Encoding.UTF8))
{
    writer.Write(reader.ReadToEnd());
}
If you are looking for a mapping of Devanagari characters to the Unicode code points:
You can find the chart at the Unicode Consortium website here.
Note that Unicode code points are traditionally written in hexadecimal. So rather than the decimal number 2350, the code point would be written as U+092E, and it appears as 092E on the code chart.
If you have the string s = मेरा then you already have the answer.
This string contains four code points in the BMP which in UTF-16 are represented by 8 bytes. You can access them by index with s[i], with a foreach loop etc.
If you want the underlying 8 bytes you can access them as so:
string str = @"मेरा";
byte[] arr = System.Text.Encoding.Unicode.GetBytes(str);
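To make the byte counts concrete, here is a small sketch (the class and method names are made up) that prints the encoded bytes as hex. Encoding.Unicode is little-endian UTF-16, so each BMP character appears as two bytes with the low byte first, while UTF-8 uses three bytes for each of these Devanagari characters.

```csharp
using System;
using System.Text;

public static class EncodedBytes
{
    // UTF-16 (little-endian) bytes of s as dash-separated hex.
    public static string Utf16Hex(string s)
    {
        return BitConverter.ToString(Encoding.Unicode.GetBytes(s));
    }

    // UTF-8 bytes of s as dash-separated hex.
    public static string Utf8Hex(string s)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(s));
    }
}
```

EncodedBytes.Utf16Hex("मेरा") returns "2E-09-47-09-30-09-3E-09" (8 bytes), whereas the UTF-8 form of the same string is 12 bytes long and begins with E0-A4-AE.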
I am getting a character from an EMF record using Encoding.Unicode.GetString, and the resulting string contains only one character but has two bytes. I don't know anything about the encoding scheme or multi-byte character sets. I want to convert that character to its equivalent single hexadecimal value. Can you help me with this?
It's not clear what you mean. A char in C# is a 16-bit unsigned value. If you've got a binary data source and you want to get Unicode characters, you should use an Encoding to decode the binary data into a string, that you can access as a sequence of char values.
You can convert a char to a hex string by first converting it to an integer, and then using the X format specifier like this:
char c = '\u0123';
string hex = ((int)c).ToString("X4"); // Now hex = "0123"
Now, that leaves one more issue: surrogate pairs. Values which aren't in the Basic Multilingual Plane (U+0000 to U+FFFF) are represented by two UTF-16 code units - a high surrogate and a low surrogate. You can use the char.IsSurrogate* methods to check for surrogate pairs... although it's harder (as far as I can see) to then convert a surrogate pair into a UCS-4 value. If you're lucky, you won't need to deal with this... if you're happy converting your binary data into a sequence of UTF-16 code units instead of strict UCS-4 values, you don't need to worry.
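For the surrogate-pair case, the framework does offer char.IsSurrogatePair and char.ConvertToUtf32(char, char), which may make the conversion less painful than it looks. The following is only a sketch with hypothetical names, and it assumes the string is non-empty.

```csharp
using System;

public static class CodePointHex
{
    // Returns the hex form of the first code point in text.
    // Assumption: text is not empty.
    public static string FirstCodePointHex(string text)
    {
        if (text.Length >= 2 && char.IsSurrogatePair(text[0], text[1]))
        {
            // Combine the high and low surrogate into one code point.
            int codePoint = char.ConvertToUtf32(text[0], text[1]);
            return codePoint.ToString("X4");
        }
        // A BMP character: the UTF-16 code unit is the code point.
        return ((int)text[0]).ToString("X4");
    }
}
```

CodePointHex.FirstCodePointHex("\u0123") returns "0123", and for the pair "\uD83D\uDE00" it returns "1F600".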
EDIT: Given your comments, it's still not entirely clear what you've got to start with. You say you've got two bytes... are they separate, or in a byte array? What do they represent? Text in a particular encoding, presumably... but which encoding? Once you know the encoding, you can convert a byte array into a string easily:
byte[] bytes = ...;
// For example, if your binary data is UTF-8
string text = Encoding.UTF8.GetString(bytes);
char firstChar = text[0];
string hex = ((int)firstChar).ToString("X4");
If you could edit your question to give more details about your actual situation, it would be a lot easier to help you get to a solution. If you're generally confused about encodings and the difference between text and binary data, you might want to read my article about it.
Try this:
System.Text.Encoding.Unicode.GetBytes(theChar.ToString())
    .Aggregate("", (agg, val) => agg + val.ToString("X2"));
However, since you don't specify exactly which encoding the character is in, this could fail. Further, you don't make it very clear whether you want the output to be a string of hex chars or bytes. I'm guessing the former, since I'd guess you want to generate HTML. Let me know if any of this is wrong.
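I created an extension method to convert a Unicode or non-Unicode string to a hex string.
I'm sharing it for anyone who is interested.
using System;
using System.Linq;
using System.Text;

public static class StringHelper
{
    // Encodes the string as UTF-8 if it contains any character above 255,
    // otherwise with the system default encoding, and returns the bytes as hex.
    public static string ToHexString(this string str)
    {
        byte[] bytes = str.IsUnicode() ? Encoding.UTF8.GetBytes(str) : Encoding.Default.GetBytes(str);
        return BitConverter.ToString(bytes).Replace("-", string.Empty);
    }
    // True if any character has a value above 255.
    public static bool IsUnicode(this string input)
    {
        const int maxAnsiCode = 255;
        return input.Any(c => c > maxAnsiCode);
    }
}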
Get thee to StringInfo:
http://msdn.microsoft.com/en-us/library/system.globalization.stringinfo.aspx
http://msdn.microsoft.com/en-us/library/8k5611at.aspx
The .NET Framework supports text elements. A text element is a unit of text that is displayed as a single character, called a grapheme. A text element can be a base character, a surrogate pair, or a combining character sequence. The StringInfo class provides methods that allow your application to split a string into its text elements and iterate through the text elements. For an example of using the StringInfo class, see String Indexing.
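As a rough illustration of the text-element iteration described in that excerpt, the sketch below collects the text elements of a string with StringInfo.GetTextElementEnumerator. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElements
{
    // Splits s into text elements: base characters, surrogate pairs,
    // and combining character sequences each count as one element.
    public static List<string> Split(string s)
    {
        var elements = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            elements.Add(e.GetTextElement());
        }
        return elements;
    }
}
```

For instance, TextElements.Split("e\u0301a") gives two elements: the "e" together with its combining acute accent, and then "a". A surrogate pair such as "\uD83D\uDE00" comes back as a single element even though it occupies two char values.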